Specification

Methods For Detecting and Isolating Nuclear Transport Proteins 
Technical Field 

The present invention relates to methods of detecting and 
isolating nuclear transport proteins, and falls into the field of 
genetic engineering, more particularly gene cloning. 
Prior Art 

Various transcription factors, nuclear receptors, signal
transduction factors, chromatin receptors, and the like are known
as nuclear transport proteins. These proteins interact directly
or indirectly with specific DNA regions at or near the end of
intracellular signal transduction cascades to control gene
expression, replication of DNA, and the like, and as a result,
determine the behavior of cells. Accordingly, the isolation of
the genes of these nuclear transport proteins and analysis of 
their functions is thought to be highly important from the 
viewpoints of elucidating various vital phenomena and developing 
new drugs. 

However, no method designed specifically for comprehensively
cloning cDNA coding for nuclear transport proteins has been
developed; the general cloning techniques applied thus far are
used instead. That is, when some information is available about
the protein whose cloning is being attempted (for example, when
a sequence is conserved at the amino acid level (Lichtsteiner,
S., Proc. Natl. Acad. Sci., 1993, 90: 9673-9677), when an
interacting DNA sequence is already known (Sanz, L., Mol. Cell.
Biol., 1995, 15: 3164-3170; Clontech Co., Matchmaker One-Hybrid
System), or when an interacting protein is already known), a
cDNA library is screened based on that information. However, in
such cases, screening is possible only within an extremely
limited range.

It is known, for example, that the "Two-Hybrid System"
(Gyuris, J., Cell, 1993, 75: 791-803; Golemis, E.A., Current
Protocols in Molecular Biology (John Wiley & Sons, Inc.), 1996,
Ch. 20.0 and 20.1), developed in recent years as a method of
isolating interacting proteins, can be used with a protein
already known to be present in the nucleus as bait to screen
indirectly for cDNA coding for proteins that interact with that
protein (Jordan, K.L., Biochemistry, 1996, 35: 12,320-12,328).
However, it cannot be directly employed as a method of
screening cDNA coding for proteins that have transport activity
into the nucleus. Further, even when employing bait in the form
of a protein known to be present in the nucleus, since it is not
known whether transport into the nucleus occurs through
interaction in the cytoplasm or through actual interaction in the
nucleus, there is also the possibility that cDNA coding for
proteins other than nuclear proteins will end up being isolated.
Thus, an arduous confirmation operation is necessary to
determine whether or not the isolated cDNA codes for a nuclear
transport protein. Further, since the "Two-Hybrid System"
detects interaction between proteins, there is also a problem in
that the proteins obtained by screening are limited to proteins
capable of interacting with the protein employed as bait.

When it is impossible to obtain information relating to the
targeted protein in the manner described above, it is necessary
to extract nuclear fractions from the cell, purify the targeted
protein therefrom using as indicators functions such as a
specific biological activity possessed by that protein, and
screen a cDNA library based on sequence information relating to
the protein obtained (Ostrowski, J., J. Biol. Chem., 1994, 269:
17,626-17,634). However, some nuclear transport proteins have
extremely low expression levels, often necessitating the
expenditure of considerable time and effort to purify, with some
of them being nearly impossible to purify.

Disclosure of the Invention 

The present invention has as its object to provide methods 
of readily and efficiently detecting and isolating DNA coding for 
peptides having nuclear transport capability. 

One example of a nuclear transport protein is a transcription
factor. The transcription factors of eukaryotic organisms have
the functions of migrating into the nucleus and inducing the
expression of a specific gene by interacting with the promoter
region of that gene. The nuclear migration ability of a
transcription factor is attributed to a nuclear migration signal
present in the transcription factor. The present inventors
focused on these two characteristics of transcription factors,
the ability to migrate into the nucleus and the ability to
activate transcription of a specific gene, and conducted
extensive research into resolving the above problem. As a
result, they arrived at the following idea. Suppose that the
region conferring nuclear transportability is eliminated from a
transcription factor, an unknown peptide is introduced in its
place, and the protein thus obtained is expressed within the
cell. If the unknown peptide in the fused protein has the
ability to migrate into the nucleus, the fused protein is
transported into the nucleus, acts on a particular promoter
region, and is thought to induce the expression of a specific
gene downstream therefrom. If the unknown peptide in the fused
protein does not have the ability to migrate into the nucleus,
the fused protein does not migrate into the nucleus and is
thought not to induce the expression of the gene downstream from
the promoter. That is, by means of a protein in which an
unknown peptide has been fused to a transcription factor not
having the ability to migrate into the nucleus, it was thought
to be possible to determine whether or not the unknown peptide
has the ability to migrate into the nucleus, using as an
indicator the induction of expression of a particular gene
downstream from the promoter.
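
The reasoning above can be summarized in a minimal illustrative
sketch. It is not part of the disclosed method: the class name
FusionProtein and the function reporter_expressed are hypothetical,
and the model assumes that the transcription factor portion
contributes no nuclear transportability of its own.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FusionProtein:
    """Hypothetical model of the fused protein described in the text."""
    has_dna_binding_domain: bool
    has_activation_domain: bool
    test_peptide_enters_nucleus: bool  # the unknown property being assayed


def reporter_expressed(fusion: FusionProtein) -> bool:
    """Return True if reporter gene expression is expected.

    Assumption: the transcription factor part has lost its own nuclear
    transport region, so nuclear entry depends only on the test peptide.
    """
    in_nucleus = fusion.test_peptide_enters_nucleus
    can_activate = (fusion.has_dna_binding_domain
                    and fusion.has_activation_domain)
    return in_nucleus and can_activate


nls_fusion = FusionProtein(True, True, test_peptide_enters_nucleus=True)
cytoplasmic_fusion = FusionProtein(True, True, test_peptide_enters_nucleus=False)

print(reporter_expressed(nls_fusion))          # True
print(reporter_expressed(cytoplasmic_fusion))  # False
```

In this simplified model the reporter acts as a logical AND of
nuclear entry and transcription activation, which is why a positive
readout is taken as evidence of nuclear transportability.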

Accordingly, based on this idea, the present inventors
actually prepared fused DNA of test DNA and DNA coding for a
transcription factor from which the region having the ability to
migrate into the nucleus had been removed, introduced it into a
eukaryotic host maintaining in the nucleus a promoter region
activated by the binding of the transcription factor and a
reporter gene the expression of which was induced by activation
of that promoter region, and detected the expression of the
reporter gene. As a result, they discovered that when DNA coding
for a peptide having the ability to migrate into the nucleus was
employed as test DNA, expression of the reporter gene was
induced, and when DNA coding for a peptide not having the ability
to migrate into the nucleus was employed as test DNA, expression
of the reporter gene was not induced.

The present inventors further prepared a library of cDNA
coding for fused proteins of other peptides and a transcription
factor from which the region having the ability to migrate into
the nucleus had been removed, and introduced this library into
cells to screen for cDNA coding for peptides having nuclear
transportability, using the expression of the reporter gene as an
indicator. As a result, the present inventors discovered that
most of the known cDNA isolated from the cDNA library coded for
proteins thought to have nuclear transportability.

That is, the present invention relates to methods of readily 
and efficiently detecting and isolating DNA coding for peptides 
having nuclear transportability using the properties of 
transcription factor, and more particularly, relates to: 

(1) A method of detecting the nuclear transportability of a
peptide coded for by test DNA, characterized in that fused DNA of
DNA coding for transcription factor not having nuclear
transportability and test DNA is introduced into a eukaryotic
host having in its nucleus a promoter region that is activated by
binding of the transcription factor and a reporter gene connected
downstream from the promoter region, and expression of the
reporter gene is detected;

(2) The method described in (1) wherein the transcription 
factor not having nuclear transportability is a fused protein 
comprising a nuclear export signal, a DNA binding domain, and a
transcription activation domain; 

(3) The method described in (1) wherein the transcription 
factor not having nuclear transportability is a fused protein 
comprising a nuclear export signal, LexA protein, and a GAL4 
transcription activation domain, and wherein the promoter region 
that is activated by binding of the transcription factor is the 
promoter region of a GAL1 gene in which the operator sequence has 
been replaced with the LexA operator sequence; 

(4) The method described in (3) wherein the nuclear export 
signal is a peptide comprising the amino acid sequence described 
in sequence number 5; 

(5) The method described in any of (1)-(4) wherein the
reporter gene is the LEU2 and/or β-galactosidase gene;

(6) A method of isolating DNA coding for a peptide having 
nuclear transportability characterized in that fused DNA of DNA 
coding for transcription factor not having nuclear 
transportability and test DNA is introduced into a eukaryotic 
host having in its nucleus a promoter region that is activated by 
the binding of transcription factor and a reporter gene connected 
downstream from the promoter region; expression of the reporter 
gene is detected; and test DNA is isolated from a eukaryotic host 
in which expression has been detected; 

(7) The method described in (6) wherein the transcription 
factor not having nuclear transportability is a fused protein 
comprising a nuclear export signal, a DNA binding domain, and a 
transcription activation domain; 

(8) The method described in (6) wherein the transcription 
factor not having nuclear transportability is a fused protein 
comprising a nuclear export signal, LexA protein, and a GAL4 
transcription activation domain, and wherein the promoter region 
that is activated by binding of the transcription factor is the 
promoter region of a GAL1 gene in which the operator sequence has 
been replaced with the LexA operator sequence; 

(9) The method described in (8) wherein the nuclear export 
signal is a peptide comprising the amino acid sequence described 
in sequence number 5; 

(10) The method described in any of (6)-(9) wherein the
reporter gene is the LEU2 and/or β-galactosidase gene;

(11) A vector having an incorporation site of test DNA 
adjacent to DNA coding for transcription factor not having 
nuclear transportability; 

(12) The vector described in (11) wherein the transcription 
factor not having nuclear transportability is a fused protein 
comprising a nuclear export signal, a DNA binding domain, and a 
transcription activation domain; 

(13) The vector described in (11) wherein the transcription 
factor not having nuclear transportability is a fused protein 
comprising a nuclear export signal, LexA protein, and the GAL4 
transcription activation domain; 

(14) The vector described in (13) wherein the nuclear export 
signal is a peptide comprising the amino acid sequence described 
in sequence number 5; 

(15) A kit comprising: (1) a vector having an incorporation 
site for test DNA adjacent to DNA coding for transcription factor 
not having nuclear transportability; and (2) a eukaryotic host 
having in its nucleus an expression unit comprising a promoter 
region activated by binding of the transcription factor and a 
reporter gene connected downstream from the promoter region; 

(16) The kit described in (15) wherein the transcription 
factor not having nuclear transportability is a fused protein 
comprising a nuclear export signal, a DNA binding domain, and a 
transcription activation domain; 

(17) The kit described in (15) wherein the transcription 
factor not having nuclear transportability is a fused protein 
comprising a nuclear export signal, LexA protein, and a GAL4
transcription activation domain; wherein the promoter region that 
is activated by binding of the transcription factor is the 
promoter region of a GAL1 gene in which the operator sequence has 
been replaced with the LexA operator sequence; and wherein the 
eukaryotic host is yeast; 

(18) The kit described in (17) wherein the nuclear export 
signal is a peptide comprising the amino acid sequence described 
in sequence number 5; and 

(19) The kit described in any of (15)-(18) wherein the
reporter gene is the LEU2 and/or β-galactosidase gene.

In the present invention, the term "transcription factor" 
means a protein having a DNA binding domain and a transcription 
activation domain that activates the transcription of a specific 
gene, regardless of whether or not it occurs naturally. Further, 
in the present invention, the term "peptide" includes, in 
addition to proteins, partial peptides of proteins, synthetic 
peptides, and the like.

The present invention relates first to a method of detecting
the nuclear transportability of a peptide coded for by test DNA,
characterized in that fused DNA of DNA coding for transcription
factor not having nuclear transportability and test DNA is
introduced into a eukaryotic host having in its nucleus a
promoter region that is activated by binding of the transcription
factor and a reporter gene the expression of which is induced by
activation of the promoter region, and expression of the
reporter gene is detected.

In the present invention, the transcription factor employed
in the preparation of "transcription factor not having nuclear
transportability" is not specifically limited other than that it
be capable of specifically controlling the expression of a gene
in a eukaryotic organism; examples of transcription factors
suitable for use are GAL4 (Giniger, E., Cell, 1985, 40: 767-774),
p53 (Chumakov, P. M., Genetika, 1988, 24: 602-612), GCN4
(Hinnebusch, A. G., Proc. Natl. Acad. Sci., 1984, 81:
6,442-6,446), VP16 (Triezenberg, S. J., Genes Dev., 1988, 2:
718-729), RelA (Nolan, G. P., Cell, 1991, 64: 961-969), Oct-1
(Sturm, R. A., Genes Dev., 1988, 2: 1,582-1,599), c-Myc (Watt,
R., Nature, 1983, 303: 725-728), c-Jun (Angel, P., Cell, 1988,
55: 875-885), MyoD (Wright, W. E., Cell, 1989, 56: 607-617), and
the like.

The "transcription factor not having nuclear
transportability" of the present invention is not particularly
limited other than that it be a transcription factor not having
nuclear transportability (or having extremely low nuclear
transportability) and having transcription activation ability and
DNA binding ability. Examples are transcription factors in which
the nuclear transport signal has been eliminated or replaced with
other amino acids, transcription factors that are fused proteins
comprising DNA binding domains and transcription activation
domains, and the like.

Substances of low molecular weight (molecular weights not
greater than 40,000 daltons) are generally thought to move into
the nucleus by diffusion through nuclear pores rather than by
specific active transport systems. Even when the active nuclear
transportability of a transcription factor is eliminated by loss
or substitution of the nuclear transport signal, movement of the
substance into the nucleus by diffusion still occurs. In such
cases, a signal that localizes the protein inside the cell but
outside the nucleus can be added, so that movement into the
nucleus by diffusion is suppressed completely or held to a
minimum. The "transcription factor not having nuclear
transportability" of the present invention includes transcription
factors to which a signal for localization inside the cell but
outside the nucleus has been added in this manner. Examples of
such localization signals are nuclear export signals (Gorlich,
D., Science, 1996, 271: 1,513-1,518), secretion signals,
peroxisome transport signals, rough endoplasmic reticulum
transport signals, mitochondrial transport signals (Nakai, K.,
Genomics, 1992, 14: 897-911; Nakai, K., PSORT WWW server,
http://psort.nibb.ac.jp/), and the like; the present invention
is not limited thereto.
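
As a rough illustration of the 40,000 dalton threshold mentioned
above, the following sketch estimates the mass of a fused protein
from its length and compares it with that threshold. The average
residue mass of about 110 daltons is a common approximation, not a
value from this specification, and the function names and example
lengths are hypothetical.

```python
AVERAGE_RESIDUE_MASS_DA = 110.0  # common rule-of-thumb value (assumption)
DIFFUSION_LIMIT_DA = 40_000.0    # threshold quoted in the text


def estimated_mass_da(n_residues: int) -> float:
    """Approximate protein mass from the number of amino acid residues."""
    if n_residues < 0:
        raise ValueError("number of residues must not be negative")
    return n_residues * AVERAGE_RESIDUE_MASS_DA


def may_diffuse_into_nucleus(n_residues: int) -> bool:
    """True if the estimated mass does not exceed the diffusion threshold."""
    return estimated_mass_da(n_residues) <= DIFFUSION_LIMIT_DA


# Hypothetical fusion proteins of 300 and 500 residues.
print(estimated_mass_da(300), may_diffuse_into_nucleus(300))  # 33000.0 True
print(estimated_mass_da(500), may_diffuse_into_nucleus(500))  # 55000.0 False
```

An estimate of this kind only suggests when passive diffusion might
produce background reporter expression, in which case adding a
localization signal such as a nuclear export signal may be worth
considering.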

Further, there are transcription factors having multiple
nuclear transport signals and transcription factors in which it
is entirely impossible to specify the position of a nuclear
transport signal within the molecule despite the observation of
nuclear transportability (GAL4, p53, and the like (TANAKA,
Mahito, Cell Science (Japanese), 1991, 7: 265-272)). Further,
there are transcription factors in which nuclear transport
signals overlap with DNA binding domains or transcription
activation domains, so that the loss or replacement of the
nuclear transport signal may result in the loss of DNA binding
ability or transcription activation ability as well. When
employing such transcription factors, even when it is impossible
to completely specify the nuclear transport signal sequence, it
suffices to specify the region that must be removed to eliminate
nuclear transportability and to remove or replace this region to
prepare a transcription factor not having nuclear
transportability. Further, an artificial hybrid transcription
factor can be created by fusing the DNA binding domain of a
protein derived from a eukaryotic or prokaryotic organism that
is known not to contain a nuclear transport signal with a
transcription activation domain that is known not to contain a
nuclear transport signal. The "transcription factor not having
nuclear transportability" in the present invention includes
transcription factors thus prepared.

The transcription activation domain employed in the
preparation of the transcription factor not having nuclear
transportability of the present invention includes, but is not
limited to, the GAL4 transcription activation domain (Brent, R.,
Cell, 1985, 43: 729-736), Bicoid, c-Fos, c-Myc, v-Myc, B6, B7,
B42 (Golemis, E. A., Mol. Cell. Biol., 1992, 12: 3,006-3,014),
GCN4 (Hope, I. A., Cell, 1986, 46: 885-894), and VP16 (Clontech
Co., Mammalian MATCHMAKER Two-Hybrid Assay Kit). The DNA binding
domain includes, but is not limited to, GAL4 (Giniger, E., Cell,
1985, 40: 767-774), p53 (Chumakov, P. M., Genetika, 1988, 24:
602-612), GCN4 (Hinnebusch, A. G., Proc. Natl. Acad. Sci., 1984,
81: 6,442-6,446), VP16 (Triezenberg, S. J., Genes Dev., 1988,
2: 718-729), RelA (Nolan, G. P., Cell, 1991, 64: 961-969), Oct-1
(Sturm, R. A., Genes Dev., 1988, 2: 1,582-1,599), c-Myc (Watt,
R., Nature, 1983, 303: 725-728), c-Jun (Angel, P., Cell, 1988,
55: 875-885), MyoD (Wright, W. E., Cell, 1989, 56: 607-617), and
other DNA binding domains that have been identified in
transcription factors.

"DNA coding for transcription factor not having nuclear
transportability" can be prepared, for example, by partially or
completely removing the DNA sequence coding for the nuclear
transport signal from DNA coding for a transcription factor, by
replacing the sequence within the nuclear transport signal
through the introduction of a site-specific mutation, by adding
a signal for localization inside the cell but outside the
nucleus, by fusing a transcription activation domain with a DNA
binding domain, or by a suitable combination of these methods.
The general genetic manipulations used in these methods are
described in the literature (Sambrook, J., Molecular Cloning: A
Laboratory Manual, 1989, 2nd Ed., Cold Spring Harbor Laboratory
Press, Cold Spring Harbor, NY).

The "test DNA" employed in the method of the present
invention includes cDNA, genomic DNA, synthetic DNA, and the
like, and is not specifically limited other than that it be DNA
coding for a protein or a component peptide thereof. DNA coding
for transcription factor not having nuclear transportability can
be fused with test DNA by the usual methods (Sambrook, J.,
Molecular Cloning: A Laboratory Manual, 1989, 2nd Ed., Cold
Spring Harbor Laboratory Press, Cold Spring Harbor, NY).
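
One practical point when fusing test DNA downstream of the
transcription factor coding sequence is that the test DNA must be
read in the same frame, without an intervening stop codon. The
sketch below is an illustrative check only; it is not taken from
this specification, the function names are hypothetical, and the
short sequences are toy examples.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def split_codons(seq: str) -> list:
    """Split a DNA sequence into complete codons, ignoring leftover bases."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def translated_test_codons(tf_cds: str, test_dna: str) -> int:
    """Count test DNA codons read before the first stop codon.

    Assumes the test DNA is joined directly after the transcription
    factor coding sequence, which must end on a codon boundary and
    must not contain a stop codon.
    """
    if len(tf_cds) % 3 != 0:
        raise ValueError("coding sequence must be a whole number of codons")
    if any(codon in STOP_CODONS for codon in split_codons(tf_cds)):
        raise ValueError("remove stop codons from the coding sequence first")
    count = 0
    for codon in split_codons(test_dna):
        if codon in STOP_CODONS:
            break
        count += 1
    return count


tf_cds = "ATGGCTGAA"                      # toy coding sequence, no stop codon
in_frame = "CCGAAAAAGAAACGTAAAGTTTAA"     # toy insert ending in a stop codon
shifted = "C" + in_frame                  # same insert shifted by one base

print(translated_test_codons(tf_cds, in_frame))  # 7
print(translated_test_codons(tf_cds, shifted))   # 5
```

A one-base shift changes which codons are read and where
translation stops, which is one reason a library may be built in
more than one reading frame.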

The fused DNA of DNA coding for transcription factor not
having nuclear transportability and test DNA is usually inserted
into a suitable expression vector and introduced into a
eukaryotic host. The expression vector is not particularly
limited other than that it be capable of stably expressing the
protein coded for by the fused DNA of test DNA and DNA coding for
the specific transcription factor from which nuclear
transportability has been eliminated; however, an expression
vector functioning as a shuttle vector stably maintained by both
the host and E. coli is preferred. For example, when baker's
yeast is employed as the host, a unit for the expression of the
protein (where the expression unit comprises a promoter region
functioning within yeast (the promoter region of ADH1, GAL1, or
the like), an expressed protein coding region, a multicloning
site, and a terminator region (the terminator region of ADH1 or
the like)) can be incorporated for use into an integrating
vector, which has no replication origin within the vector and is
integrated into a yeast chromosome, or into a plasmid vector,
which has a replication origin within the vector and is present
as a plasmid (centromere vectors (low copy), 2μ vectors (high
copy), and the like are commercially available). Specifically,
integrating vectors and centromere vectors are commercially
available from Stratagene Co. as "pRS vectors" having various
auxotrophic marker genes (LEU2, HIS3, URA3, TRP1, and the like)
for complementing the nutritional requirements of the host.
Mutant host strains corresponding to the respective marker genes
are included as kits. Various commercially available vectors
employed in "Two-Hybrid systems" (Stratagene Co.'s HybriZapII,
GAL4 Two-Hybrid Phagemid vector, Clontech Co.'s Matchmaker
vector, and the like), which have auxotrophic marker genes (LEU2,
HIS3, URA3, TRP1, and the like) for complementing the
nutritional requirements of the host, can be employed as 2μ
vectors together with the mutant host strains corresponding to
the respective vectors. When employing animal cells as host,
commercially available common mammalian expression vectors in
the form of vectors integrated into chromosomes (such as pMAM,
pMAM-neo, and the like from Clontech Co.) or vectors maintained
as episomes (ADR2, pDR2 vector systems and the like from
Clontech Co.) may be combined with suitable host animal cells
(CHO, mouse fibroblast, HeLa, U937, BHK, and the like) for use.
The vector pMT2 and the like for transient expression using COS
cells or the like (Sambrook, J., Molecular Cloning: A Laboratory
Manual, 1989, 2nd Ed., Cold Spring Harbor Laboratory Press, Cold
Spring Harbor, NY) may also be employed. The insertion of the
above-described fused DNA into the expression vector may be
conducted by the usual methods (Sambrook, J., Molecular Cloning:
A Laboratory Manual, 1989, 2nd Ed., Cold Spring Harbor
Laboratory Press, Cold Spring Harbor, NY).

Further, the "eukaryotic host" into which the
above-described fused DNA is incorporated in the present
invention is not specifically limited other than that it be a
eukaryotic host capable of stably expressing proteins coded for
by the above-described fused DNA. However, from the
perspectives of convenience of handling, ease of incorporation
and recovery of genes, safety, and the like, yeast and cultured
animal cells are particularly desirable. The eukaryotic host
employed in the present invention has within its nucleus a
promoter region that is activated by the binding of a specific
transcription factor and a reporter gene connected downstream
from this promoter region.

The "promoter region that is activated by the binding of a
specific transcription factor" includes a TATA sequence and a
cis control region, called an upstream activating sequence (UAS)
or operator sequence, to which the transcription factor binds.
It is not particularly limited other than that it be a promoter
region in which transcription is specifically activated when the
transcription factor binds to the UAS. For example, in the case
of baker's yeast, examples of cis control regions are natural
GAL1 UAS (comprising four GAL4 binding sequences), artificial
GAL1 UAS (comprising three GAL4 binding sequences), and LexA UAS
(comprising 1-8 LexA binding sequences) (Estojak, J., Mol. Cell.
Biol., 1995, 15: 5,820-5,829). Further, examples of TATA
sequences are GAL1 TATA, CYC1 (cytochrome C1) TATA, LEU2 TATA,
and HIS3 TATA. These cis control regions and TATA sequences can
be combined to construct various promoter regions differing in
expression level and induction conditions (Clontech Co., Yeast
Protocols Handbook, PT3024-1: 5-8). That is, a promoter region
in which a transcription factor binding sequence is present in
the cis control region and the activity of the promoter is
controlled by the transcription factor suffices.
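
The combinations described above can be enumerated in a short
sketch. The element names are those listed in the text; the
pairing logic, the dictionary layout, and the function name
candidate_promoters are illustrative assumptions and do not
indicate which combinations are actually functional or available.

```python
from itertools import product

# cis control region -> protein whose DNA binding domain it binds (from the text)
CIS_CONTROL_REGIONS = {
    "natural GAL1 UAS": "GAL4",
    "artificial GAL1 UAS": "GAL4",
    "LexA UAS": "LexA",
}
TATA_SEQUENCES = ["GAL1 TATA", "CYC1 TATA", "LEU2 TATA", "HIS3 TATA"]


def candidate_promoters(dna_binding_domain: str) -> list:
    """List cis control region and TATA pairings matching a DNA binding domain."""
    usable = [name for name, binder in CIS_CONTROL_REGIONS.items()
              if binder == dna_binding_domain]
    return [f"{cis} + {tata}" for cis, tata in product(usable, TATA_SEQUENCES)]


for promoter in candidate_promoters("LexA"):
    print(promoter)
```

With LexA as the DNA binding domain, only the LexA UAS qualifies, so
the sketch lists one candidate promoter per TATA sequence.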

Further, in baker's yeast, the genetic analysis of which is
quite advanced, using as the reporter gene a gene relating to
the nutritional requirements of the host (LEU2, HIS3, TRP1, URA3,
or the like), a gene (such as GAL1) relating to the utilization
of required nutritional sources, or a gene compensating for the
loss or damage of some other gene required for survival makes it
possible to readily detect the expression of the gene through the
survival or death of the host. It is also possible to employ a
generally known reporter gene that can be detected by the
activity of an enzyme such as β-galactosidase, chloramphenicol
acetyltransferase, or luciferase, or green fluorescent protein
(from Clontech Co.), which permits the direct detection of
fluorescent light while the transformants are alive. Further,
the above-described general-use reporter genes as well as
drug-resistance genes may be employed in animal cells.

The above-described promoter region and reporter gene may be
spliced by the usual methods (Sambrook, J., Molecular Cloning: A
Laboratory Manual, 1989, 2nd Ed., Cold Spring Harbor Laboratory
Press, Cold Spring Harbor, NY).

For example, when employing baker's yeast as host, the usual
methods, such as the lithium acetate method (Clontech Co., Yeast
Protocols Handbook, PT3024-1: 17-20), can be used to introduce
the promoter region activated by the binding of the
transcription factor and the reporter gene that is spliced
downstream from this promoter region. Depending on the vector
employed (either the above-described integrating vector or a
plasmid vector), the target gene may be selectively incorporated
onto a chromosome or placed within the nucleus as a plasmid. In
animal cells, gene introduction is also possible by the usual
methods, such as the liposome method (Sambrook, J., Molecular
Cloning: A Laboratory Manual, 1989, 2nd Ed., Cold Spring Harbor
Laboratory Press, Cold Spring Harbor, NY). Depending on the
vector employed (either the above-described integrating vector
or an episome vector), the target gene may be selectively
incorporated onto a chromosome or placed within the nucleus as
an episome.

Further, a commercially available eukaryotic host organism
may be employed as the eukaryotic host organism having the
above-described promoter region and reporter gene within the
cell. For example, when employing LexA as the transcription
factor DNA binding domain, the yeast EGY48 [p8op-lacZ]
(available from Clontech Co.) can be employed. This strain
carries promoter regions having the LexA operator sequence, with
the downstream reporter genes LEU2 and β-galactosidase on the
chromosome and on a plasmid, respectively.

For example, when employing baker's yeast as host, the
vector containing the fused DNA of test DNA and DNA coding for
the specific transcription factor the nuclear transportability
of which has been eliminated can be incorporated into the
eukaryotic host by the usual methods, such as the lithium acetate
method (Clontech Co., Yeast Protocols Handbook, PT3024-1: 17-20).
Depending on the vector employed (either the above-described
integrating vector or a plasmid vector), the target gene may be
selectively incorporated onto a chromosome or placed within the
nucleus as a plasmid. In animal cells, gene introduction is also
possible by the usual methods, such as the liposome method
(Sambrook, J., Molecular Cloning: A Laboratory Manual, 1989, 2nd
Ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor,
NY). Depending on the vector employed (either the
above-described integrating vector or an episome vector), the
target gene may be selectively incorporated onto a chromosome or
placed within the nucleus as an episome.

For example, in baker's yeast, the genetic analysis of which
is quite advanced, using as the reporter gene a gene relating to
the nutritional requirements of the host (LEU2, HIS3, TRP1, URA3,
or the like), a gene (such as GAL1) relating to the utilization
of required nutritional sources, or a gene compensating for the
loss or damage of some other gene required for survival makes it
possible to readily detect the expression of the reporter gene in
the transformant thus obtained through the survival or death of
the host. It is also possible to employ a generally known
reporter gene that can be detected by the activity of an enzyme
such as β-galactosidase, chloramphenicol acetyltransferase, or
luciferase, or green fluorescent protein (from Clontech Co.),
which permits the direct detection of fluorescent light emitted
by living cells. Further, the above-described general-use
reporter genes as well as drug-resistance genes may be employed
in animal cells to detect expression. As a result, if expression
of the reporter gene is detected, it may be concluded that the
test DNA codes for a peptide having nuclear transportability, and
if expression of the reporter gene is not detected, it may be
concluded that the test DNA does not code for a peptide having
nuclear transportability.
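
The conclusion drawn from the reporter readout can be written as a
small decision function. This is an illustrative sketch only: the
function name classify_clone is hypothetical, and the handling of
disagreeing reporters as a case to retest is an assumption that
does not appear in this specification.

```python
from typing import Mapping


def classify_clone(reporter_readouts: Mapping[str, bool]) -> str:
    """Interpret reporter readouts (True = expression detected) for one clone."""
    if not reporter_readouts:
        raise ValueError("at least one reporter readout is required")
    results = list(reporter_readouts.values())
    if all(results):
        return "codes for a peptide having nuclear transportability"
    if not any(results):
        return "does not code for a peptide having nuclear transportability"
    # Assumption: conflicting reporters are treated as inconclusive.
    return "reporters disagree; retest"


# Hypothetical readouts: growth without leucine (LEU2) and enzyme activity.
print(classify_clone({"LEU2": True, "beta-galactosidase": True}))
print(classify_clone({"LEU2": False}))
```

Using two reporters together, as with LEU2 and β-galactosidase, may
help reduce false positives, at the cost of some inconclusive
clones.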

Second, the present invention relates to a method of
isolating test DNA coding for a peptide having nuclear
transportability, characterized in that fused DNA of DNA coding
for transcription factor not having nuclear transportability and
test DNA is introduced into a eukaryotic host having in its
nucleus a promoter region that is activated by the binding of
the transcription factor and a reporter gene connected
downstream from the promoter region; expression of the reporter
gene is detected; and test DNA is isolated from a eukaryotic
host in which expression has been detected. For example, in the
case of baker's yeast in which the test DNA is present on a
plasmid (yeast-E. coli shuttle vector), the test DNA can be
isolated from a host in which expression of the reporter gene
has been detected by purifying plasmid from a single colony,
using the plasmid obtained to transform E. coli, and further
purifying plasmid from the transformant. Alternatively, total
DNA can be purified from a single colony and used as a template
to amplify and purify the test DNA by PCR (Clontech Co., Yeast
Protocols Handbook, PT3024-1: 29-37). In the case of animal
cells as well, total DNA is basically purified from a single
colony and employed as a template to amplify and isolate the
test DNA by PCR.

The present invention further relates to a kit comprising: a 
vector having an incorporation site for test DNA adjacent to DNA 
coding for a transcription factor not having nuclear 
transportability; and a eukaryotic host having in its nucleus an 
expression unit comprising a promoter region that binds the 
transcription factor and a reporter gene spliced downstream from 
the promoter region. Test DNA is introduced at the test DNA 
introduction site in the vector of the present invention, and the 
vector is introduced into a eukaryotic host having in its nucleus 
an expression unit comprising a promoter region that binds the 
transcription factor and a reporter gene spliced downstream from 
the promoter region. The test DNA introduction site is usually 
the only site on the vector that can be cleaved by a specific 
restriction enzyme. When expression of the reporter gene in the 
eukaryotic host is detected following introduction of the vector, 
it is concluded that the test DNA that has been introduced into 
the vector codes for a peptide having nuclear transportability; 
when expression of the reporter gene is not detected, it is 
concluded that the test DNA does not code for a peptide having 
nuclear transportability. Thus, it is readily possible to 
determine whether or not the test DNA codes for a peptide having 
nuclear transportability, and the isolation of DNA coding for 
peptides having nuclear transportability is readily accomplished. 
Specifically, a DNA library can be built with the above-described 
vectors, this library can be introduced into the above-described 
eukaryotic hosts, and the expression of reporter genes can be 
detected to efficiently and comprehensively isolate DNA coding 
for peptides having nuclear transportability from within the 
library. 

Brief Description of the Figures 

Fig. 1 shows the plasmid "pLexAD". 
Fig. 2 shows the plasmid "pLexADrev". 
Fig. 3 shows the plasmid "pRS1F". 
Fig. 4 shows the plasmid "pRS3F". 

Fig. 5 shows an assay for nuclear transportability in a 
transcription factor by fusion with a test peptide. 

Fig. 6 shows an assay for nuclear transportability in a 
transcription factor fused to a test peptide. 

Fig. 7 shows the plasmid "pNS". 

Fig. 8 shows an assay for nuclear transportability in 
various peptides using the plasmid "pNS". 

Best Mode of Implementing the Invention 

Embodiments of the present invention are specifically 
described below; however, the present invention is not limited to 
these embodiments. In the embodiments set forth below, except 
where specifically stated otherwise, the basic genetic 
engineering methods employed were those described in the 
literature (Sambrook, J., Molecular Cloning: A Laboratory 
Manual, 1989, 2nd Ed., Cold Spring Harbor Laboratory Press, Cold 
Spring Harbor, NY). Restriction enzymes, other modifying 
enzymes, and other genetic engineering products were purchased 
from Takara Shuzo and used under the conditions given in the 
respective accompanying manuals. Further, a "QIAprep Kit" 
(from Qiagen Co.) was employed to purify the plasmids from E. 
coli. An "ABI Prism 377" (from Perkin Elmer Co.) was employed to 
verify base sequences. Reagents from that company were employed 
to prepare samples for analysis, and the methods employed 
conformed to the product manuals. Handling of the yeast (culture 
media, host, shuttle vectors, gene introduction methods, reporter 
gene assay method, gene isolation, and the like) was conducted 
with a "Matchmaker LexA Two-Hybrid System" (from Clontech Co.) 
according to the accompanying "Yeast Protocols Handbook". 
Synthesis of custom oligonucleotides was outsourced to Toa Gosei 
Co. 

[Embodiment 1] Preparation of DNA sequences coding for GAL4 
transcription activation domain by PCR 

(1) Amplification by PCR of DNA sequences coding for the 
GAL4 transcription activation domain 

A DNA fragment comprising the GAL4 transcription activation 
domain (the base sequence of which is given by sequence number 3) 
was amplified with the "GeneAmp PCR System 2400" (Perkin Elmer 
Co.) using "Plasmid pACT2" (Clontech Co.) as the template and, as 
primers, "Primer NU13" (sequence number 1), which has an added 
EcoRI site designed into its 5' end, and "Matchmaker 3' AD 
LD-Insert Screening Amplimer" (sequence number 2) (Clontech Co.). 
"TaKaRa Ex Taq" (TaKaRa Co.) was employed as the Taq polymerase, 
and the product manual was followed for the reaction conditions 
and the like. The DNA fragment amplified in this manner was 
purified by ethanol precipitation and digested with the 
restriction enzymes EcoRI and NcoI. Six percent polyacrylamide 
gel electrophoresis was conducted, and the targeted DNA fragment 
was cut out of the gel and recovered by electroelution. 

(2) Preparation of the vector "pLexAD" expressing a fused 
protein of the LexA protein and GAL4 transcription activation 
domain 

The DNA fragment of (1) above coding for the GAL4 
transcription activation domain was inserted between the EcoRI 
site and the NcoI site in the multicloning site of the plasmid 
"pLexA" (Clontech Co.) to build "pLexAD" (Fig. 1). The base 
sequence was determined to verify that the targeted segment had 
indeed been inserted. The base sequence of the LexA gene is 
given by sequence number 4. 

(3) Preparation of the vector "pLexADrev" in which the 
nuclear export signal (NES) is inserted at the N-terminus of LexA 

A nuclear export signal (sequence number 5) of the Rev 
protein of HIV was synthesized in the following manner and 
inserted at the HpaI site near the N-terminus of the LexA protein 
coded for by "pLexAD". "NU9" (sequence number 6) was 
synthesized as the sense strand and "NU10" (sequence number 7) as 
the antisense strand, both were phosphorylated at the 5' end 
with T4 polynucleotide kinase, and the two were annealed. The 
resulting DNA fragment was inserted into "pLexAD" that had been 
predigested with HpaI and dephosphorylated with alkaline 
phosphatase to construct "pLexADrev" (Fig. 2). The base sequence 
was determined to verify that the targeted segment had indeed 
been inserted. 
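
Whether two synthetic strands of this kind can anneal into a duplex may be checked by confirming that one strand is the reverse complement of the other. The following Python sketch illustrates such a check. The sequences shown are hypothetical placeholders, not the sequences of sequence numbers 6 and 7, and the function names are assumptions introduced only for illustration. A fully paired (blunt) duplex is assumed, which is consistent with insertion at a blunt-cutting site such as HpaI.

```python
# Watson-Crick complement of each base.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def anneals_blunt(sense: str, antisense: str) -> bool:
    """True if the two strands pair over their full length (blunt duplex)."""
    return antisense.upper() == reverse_complement(sense)


# Hypothetical placeholder oligonucleotides (NOT sequence numbers 6 and 7).
sense_oligo = "ATGCTGCCGCTG"
antisense_oligo = reverse_complement(sense_oligo)

print(anneals_blunt(sense_oligo, antisense_oligo))  # complementary pair
print(anneals_blunt(sense_oligo, sense_oligo))      # a strand paired with itself
```

A linker designed with cohesive ends would need a check on the overlapping region only; that case is not covered by this sketch.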

(4) Construction of the plasmid "pRS1F", which has a CEN/ARS 
region as its replication origin, for the expression of a 
fused protein of LexA protein and a GAL4 transcription activation 
domain, and construction of the plasmid "pRS3F", which has a 
CEN/ARS region as its replication origin, for the expression of a 
fused protein of LexA protein with an inserted NES and a GAL4 
transcription activation domain 

The minimum unit required for the expression in yeast of a 
fused protein of ordinary LexA protein not having an inserted 
nuclear export signal (NES) and a GAL4 transcription activation 
domain is a DNA fragment of about 1.7 kb obtained by digesting 
"pLexAD" with SphI. The corresponding unit for a fused protein 
of LexA protein with an NES inserted at the N-terminus and a GAL4 
transcription activation domain (the base sequence, shown 
together with the amino acid sequence of this fused protein, is 
given in sequence number 8) is a DNA fragment of about 1.7 kb 
obtained by digesting "pLexADrev" with SphI. Each expression 
unit comprises an ADH1 promoter region, a coding region for the 
expressed protein, a multicloning site, and an ADH1 terminator 
region. After purification, the DNA fragment of each expression 
unit was inserted at the SphI site of the vector "pRSF" to 
construct "pRS1F" and "pRS3F" (Figs. 3 and 4, respectively). 
"pRSF" had been prepared in advance by replacing the 
PvuII-digested fragment comprising the multicloning site of the 
plasmid "pRS413" (Stratagene Co.) (yeast shuttle vector, CEN/ARS 
origin) with the PvuII-digested fragment comprising the 
multicloning site of the widely used plasmid pUC19. The base 
sequence was determined to verify that the targeted segment had 
indeed been inserted. Since a pLexA-derived multicloning site 
was present immediately following the fused protein functioning 
as a transcription factor in the "pRS1F" (positive control) and 
"pRS3F" constructed in this manner, a targeted DNA fragment such 
as cDNA could be readily fused by the usual methods and 
expressed. 

[Embodiment 2] Validation of the effectiveness of the 
nuclear transport protein trap vector "pRS3F" by fusion of 
artificial nuclear transport protein cDNA 

(1) Fusion of a known cDNA fragment 

Two known cDNA fragments were each fused in-frame onto the 
C-terminus of the GAL4 transcription activation domain of 
"pRS3F". The first was a cDNA fragment coding for the 
branched-chain amino acid binding protein of Pseudomonas 
aeruginosa from which the secretion signal had been removed 
('BraC), which is observed to localize in the cytoplasm (the base 
sequence, shown together with the amino acid sequence of this 
protein, is given in sequence number 9) (TANAKA, Mahito, New 
Biochemistry Experiment Lecture 6 (Ed. by the Japan Biochemistry 
Society), Biomembranes and Membrane Transport (2/2), 1992, Tokyo 
Chemistry Club, 9 15). The second was an artificial nuclear 
transport protein in which an SV40 large T antigen-derived 
nuclear transport signal was fused onto the N-terminus of 'BraC. 
More precisely, "pRS3F'BraC" was constructed by inserting the DNA 
fragment (NcoI-DraI) coding for 'BraC into "pRS3F" that had been 
digested with XhoI, treated with Klenow fragment to blunt the 
ends, digested with NcoI, and purified. "pRS3FN'BraC" was then 
constructed by inserting a synthetic DNA fragment coding for a 
nuclear transport signal (sequence number 10) derived from SV40 
large T antigen into a vector obtained by digesting this 
"pRS3F'BraC" with NheI and NcoI and purifying it. The synthetic 
fragment was prepared by synthesizing "NU17" as the sense strand 
(sequence number 11) and "NU18" as the antisense strand (sequence 
number 12), phosphorylating their 5' ends with T4 polynucleotide 
kinase, and annealing the two. Further, as a control, "pRS3FN", 
having only the nuclear transport signal and no 'BraC fragment, 
was constructed in the same manner. Correct insertion of the 
targeted fragment was confirmed by determining the base sequence. 

(2) Nuclear transport capability assay based on reporter 
gene expression 

The three plasmids "pRS3F'BraC", "pRS3FN'BraC", and "pRS3FN" 
described in (1) above and the "pRS1F" and "pRS3F" constructed in 
Embodiment 1 were used to transform the host yeast EGY48 
[p8op-lacZ] (obtained from Clontech Co.), which carries the LEU2 
reporter gene on its chromosome and the β-galactosidase reporter 
gene on a plasmid, each downstream from a promoter region 
(sequence number 13) having the LexA operator sequence (Estojak, 
J., Mol. Cell. Biol., 1995, 15: 5,820-5,829). Introduction of the 
targeted plasmids into the host was confirmed by complementation 
of HIS, a nutritional requirement marker. Next, the respective 
transformants were replicated onto culture medium (SD/-LEU, -HIS, 
-URA, X-gal) for assaying expression of the reporter genes and 
cultured for 2-3 days at 30 °C. As a result, both reporter genes, 
β-galactosidase and LEU2, were expressed, and blue coloration and 
normal growth were confirmed, in the transformants into which had 
been introduced "pRS3FN'BraC" fused with the artificial nuclear 
transport protein, "pRS3FN" fused with only the nuclear transport 
signal, and "pRS1F" as a positive control (Figs. 5 and 6). By 
contrast, almost no reporter gene expression occurred and neither 
blue coloration nor growth was observed in the transformants into 
which had been introduced "pRS3F'BraC" fused with a protein having 
no nuclear transport signal and "pRS3F" that had not been fused 
with anything (Figs. 5 and 6). 

These results show that fusing a DNA fragment coding for a 
given peptide in-frame onto the C-terminus of the transcription 
factor encoded by "pRS3F" and expressing the fusion in yeast 
permits the presence or absence of nuclear transportability to be 
detected using expression of the reporter gene as an indicator. 
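
The pattern of results in this embodiment can be summarized as a simple rule: the LexA/GAL4 AD fusion activates the reporter genes unless a nuclear export signal is present without a nuclear transport signal. The following Python sketch encodes that rule for the five constructs tested. It is a simplified illustration, not a model of the underlying biology; the class and function names are hypothetical, and the assumption that the nuclear transport signal overrides the NES is taken only from the outcomes reported above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Construct:
    name: str
    has_nes: bool  # Rev-derived nuclear export signal inserted in LexA
    has_nls: bool  # nuclear transport signal supplied by the fused peptide


def reporter_expected(c: Construct) -> bool:
    """Predict reporter expression under the simplified rule of Embodiment 2.

    Assumption: without an NES the fusion reaches the nucleus on its own;
    with an NES it reaches the nucleus only if the fused peptide has a
    nuclear transport signal.
    """
    return (not c.has_nes) or c.has_nls


constructs = [
    Construct("pRS1F", has_nes=False, has_nls=False),       # positive control
    Construct("pRS3F", has_nes=True, has_nls=False),        # nothing fused
    Construct("pRS3F'BraC", has_nes=True, has_nls=False),   # cytoplasmic protein
    Construct("pRS3FN", has_nes=True, has_nls=True),        # signal only
    Construct("pRS3FN'BraC", has_nes=True, has_nls=True),   # signal + 'BraC
]

for c in constructs:
    outcome = "expressed" if reporter_expected(c) else "not expressed"
    print(f"{c.name}: reporter {outcome}")
```

Under this rule the predictions agree with the blue coloration and growth observed for each construct in Figs. 5 and 6.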

[Embodiment 3] Construction of the vector pNS for creating a 
cDNA library 

"pRS3F" was improved in the following three respects: (1) 
elimination of the EcoRI site in the portion joining LexA and the 
GAL4 AD, (2) introduction of an EcoRI site at the multicloning 
site, and (3) elimination of unneeded regions derived from 
"pRS413" to achieve the smallest size possible. 

First, a synthetic linker consisting of "NU31" as the sense 
strand (sequence number 14) and "NU30" as the antisense strand 
(sequence number 15) was inserted at the EcoRI site of 
"pLexADrev" to obtain the plasmid "pLexADrev-dE". A DNA fragment 
comprising the ADH1 expression unit of about 1.7 kb obtained by 
digesting "pLexADrev-dE" with the restriction enzyme SphI was 
subcloned at the SphI site of the widely employed plasmid pUC19 
to obtain the plasmid "pULexADrev-dE". Next, a synthetic linker 
having an EcoRI site, consisting of "NU28" as the sense strand 
(sequence number 16) and "NU29" as the antisense strand (sequence 
number 17), was inserted between the NheI site and the NcoI site 
of "pULexADrev-dE" to obtain the plasmid "pULexADrev-E". 
Further, "pRS413" was digested with DraIII and PvuII to remove a 
757 bp DNA fragment comprising the multicloning site, and a 
synthetic linker having an SphI site, consisting of "NU25" as the 
sense strand (sequence number 18) and "NU26" as the antisense 
strand (sequence number 19), was inserted at the removal site to 
obtain the plasmid "pRS-S". A DNA fragment comprising the ADH1 
expression unit of about 1.7 kb obtained by digesting the 
above-described "pULexADrev-E" with SphI was inserted at the SphI 
site of "pRS-S" to construct the vector pNS (Fig. 7) for use in 
creating a cDNA library (the transcription direction of ADH1 is 
identical to that of HIS3). 

[Embodiment 4] Creation of a fused protein expression 
library (derived from precursor cells of the cultured human cell 
NT2) and a nuclear transport assay 

(1) Creation of a fused protein expression library 

mRNA was prepared by culturing precursor cells (Stratagene 
Co.) of the cultured human cell NT2 according to the accompanying 
protocol (Catalog #204101, Revision #036002a) and using a 
commercial total RNA extraction kit and an mRNA extraction kit 
(Pharmacia Co.). Using a portion thereof (3 µg), a cDNA library 
was created with a commercial cDNA synthesis kit (Pharmacia 
Co.). Specifically, cDNA was synthesized using oligo(dT)12-18 
primer and inserted at the EcoRI/NotI site of the pNS vector. 
The cDNA was introduced unidirectionally using a Directional 
Cloning Toolbox (Pharmacia Co.). Subsequently, a portion of the 
cDNA library was employed to transform commercial E. coli 
(ElectroMAX DH10B Cells from GibcoBRL Co.) by electroporation 
(Gene Pulser from Bio-Rad Co.) conducted in the usual manner (New 
Cytoengineering Test Protocol, Hidemasu Co., 114-115). The 
transformants obtained were cultured for 16 hr at 30 °C on an LB 
agar medium comprising ampicillin (100 µg/mL), and after 
collecting the bacteria, plasmid was prepared (Qiagen Maxi kit 
from Qiagen Co.). 

(2) Nuclear transport assay employing yeast 

Using 60 µg of the plasmid of the fused protein expression 
library that had been prepared, the EGY48 strain was transformed 
by the usual methods (Clontech Co., Yeast Protocols Handbook, 
PT3024-1: 17-20). When the transformants were cultured for 3-7 
days at 30 °C on SD agar medium (-His/-Leu) to select clones 
based on expression of the reporter gene LEU2, about 1,000 
positive clones were obtained. 

(3) Determination of base sequences 

The base sequences of the cDNA fragments inserted into the 
vector were determined for some (12) of the positive clones thus 
obtained. To determine the base sequence, colony PCR was first 
employed to prepare template DNA from each of the clones. A 
small quantity of cells scraped from each clone was added to 
20 µL of PCR reaction solution (0.5 unit of heat-resistant DNA 
polymerase (Ex Taq from TaKaRa Co.), 4 nmol of dNTP mixture, 0.4 
pmol each of "primer NU15" (sequence number 20) and "primer NU36" 
(sequence number 21), 2 µL of the accompanying buffer, and 
sterilized water), and the inserted cDNA fragments were subjected 
to 40 cycles of amplification using a "GeneAmp PCR System 2400" 
(Perkin Elmer Co.) at a denaturation [temperature] of 94 °C, an 
annealing [temperature] of 60 °C, and an extension [temperature] 
of 72 °C. Each PCR product was desalted with a Microcon-100 
(Millipore Co.) to remove unreacted primer and obtain template 
DNA. A portion (100-200 ng) of the template thus obtained was 
used to determine the base sequence by the method described in a 
product manual from ABI Co. 
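
The amounts given for the colony PCR can be converted to final concentrations in the 20 µL reaction. The following Python sketch performs that arithmetic. The variable and function names are hypothetical; the dNTP figure is simply the stated 4 nmol divided by the reaction volume; and the buffer dilution factor assumes that the 2 µL of buffer is brought to the full 20 µL. Hold times for the temperature steps are not given in the text and are not assumed here.

```python
REACTION_VOLUME_UL = 20.0


def micromolar(amount_pmol: float, volume_ul: float) -> float:
    """Convert an amount in pmol and a volume in µL to µmol/L.

    1 pmol/µL equals 1 µmol/L, so the ratio is already in µM.
    """
    return amount_pmol / volume_ul


dntp_um = micromolar(4.0 * 1000.0, REACTION_VOLUME_UL)  # 4 nmol = 4000 pmol
primer_um = micromolar(0.4, REACTION_VOLUME_UL)          # each primer
buffer_dilution = REACTION_VOLUME_UL / 2.0               # 2 µL in 20 µL

# Thermal program as described: 40 cycles of three temperature steps.
CYCLES = 40
STEPS_C = [("denaturation", 94), ("annealing", 60), ("extension", 72)]

print(f"dNTP mixture: {dntp_um:.0f} µM")
print(f"each primer: {primer_um * 1000:.0f} nM")
print(f"buffer diluted {buffer_dilution:.0f}-fold")
print(f"{CYCLES} cycles of " + ", ".join(f"{n} {t} °C" for n, t in STEPS_C))
```

With these inputs the dNTP mixture works out to 200 µM and each primer to 20 nM in the reaction.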

(4) Database analysis of the clones obtained 

The base sequence of each clone was searched for using the 
Basic BLAST (http://www.ncbi.nlm.nih.gov/cgi-bin/BLAST/nph-blast?Jform=0) 
of the National Center for Biotechnology Information (NCBI), a 
public database. As a result, all 12 clones matched previously 
known genes. Of those, ten have been reported or suggested to 
function within the nucleus. Of the ten, five of the clones had 
nuclear transport signal-like sequences of the SV40 large T 
antigen type, rich in basic amino acids: NP220 (Inagaki, H., J. 
Biol. Chem., 1996, 271: 12,525-12,531), PC4 (Ge, H., Cell, 1994, 
78: 513-523), ERC-55 (Imai, T., Biochem. Biophys. Res. Commun., 
1997, 233: 765-769), histone-binding protein (O'Rand, M. G., Dev. 
Biol., 1992, 154: 37-44), and prothymosin α (Manrow, R. E., J. 
Biol. Chem., 1991, 266: 3,916-3,924). One clone was hnRNP A1, 
which has an M9 sequence that shuttles into and out of the 
nucleus (Michael, W. M., Cell, 1995, 83: 415-422). Four more of 
the clones had no known nuclear transport signals: ferritin H 
chain (Cai, C. X., J. Biol. Chem., 1997, 272: 12,831-12,839), 
chaperonin 10 (Bonardi, M. A., Biochem. Biophys. Res. Commun., 
1995, 206: 260-265), protein kinase C inhibitor-I (Brzoska, P. 
M., Proc. Natl. Acad. Sci., 1995, 92: 7,824-7,828), and steroid 
receptor coactivator-1 (Onate, S. A., Science, 1995, 270: 
1,354-1,357). No known nuclear transport signal was found in the 
two remaining clones, for which no function within the nucleus 
has yet been reported: tropomyosin (Lin, C. S., Mol. Cell. Biol., 
1988, 8: 160-168) and G-rich sequence factor-1 (Qian, Z., Nucleic 
Acids Res., 1994, 22: 2,334-2,343). 

[Embodiment 5] Creation of a fused protein expression 
library (derived from human fetal brain [cells]) and a nuclear 
transport assay 

(1) First, a commercial human fetal brain cDNA library 
(Superscript library from GibcoBRL Co.) was amplified according 
to the protocol provided by the manufacturer. Plasmids 
comprising the cDNA fragments as inserts were then prepared using 
a plasmid preparation kit from Qiagen Co. Next, cDNA fragments 
cut out from a portion (30 µg) thereof using the two restriction 
enzymes EcoRI and NotI were size-fractionated by 0.8 percent 
agarose electrophoresis to obtain cDNA 0.7-4 kb in length. The 
cDNA fragments thus obtained were inserted at the EcoRI/NotI site 
of the above-described pNS vector to prepare a fused protein 
expression library. A portion thereof was employed to transform 
commercial E. coli (ElectroMAX DH10B Cells from GibcoBRL Co.) by 
electroporation (Gene Pulser from Bio-Rad) using the usual method 
(New Cytoengineering Test Protocol, Hidemasu Co., 114-115). The 
transformants obtained were cultured for 16 hr at 30 °C on an LB 
agar medium comprising ampicillin (100 µg/mL), the bacteria were 
collected, and plasmid was prepared (Qiagen Maxi Kit from 
Qiagen). 
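
The size fractionation step keeps only fragments within a fixed length window. The following Python sketch shows the selection as a simple filter. The fragment lengths are hypothetical values chosen for illustration, the names are assumptions, and treating both limits of the 0.7-4 kb window as inclusive is also an assumption, since a gel cut does not define sharp boundaries.

```python
MIN_KB = 0.7
MAX_KB = 4.0


def in_size_window(length_kb: float, low: float = MIN_KB, high: float = MAX_KB) -> bool:
    """True if a fragment length falls inside the selected window (inclusive)."""
    return low <= length_kb <= high


# Hypothetical fragment lengths in kb, for illustration only.
fragments_kb = [0.3, 0.7, 1.8, 2.5, 4.0, 5.2]
selected = [kb for kb in fragments_kb if in_size_window(kb)]
print(selected)
```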

(2) Nuclear Transport Assay Employing Yeast 

The EGY48 strain was transformed by the usual method (Clontech 
Co., Yeast Protocols Handbook, PT3024-1: 17-20) using 60 µg of 
plasmid from the fused protein expression library that had been 
prepared. When cultivated for 3-7 days at 30 °C on SD agar 
medium (-His/-Leu) to select clones based on expression of the 
reporter gene LEU2, about 1,000 positive clones were obtained. 

(3) Base sequencing 

The base sequences of the cDNA fragments inserted into the 
vector were determined for some (489 clones) of the positive 
clones thus obtained. To conduct sequencing, template DNA was 
prepared from each clone by colony PCR. A small quantity of 
cells scraped from each clone was added to 20 µL of PCR reaction 
solution (0.5 unit of heat-resistant DNA polymerase (Ex Taq from 
TaKaRa Co.), 4 nmol each of dNTP mixture, 0.4 pmol each of 
"primer NU15" (sequence number 22) and "primer NU36" (sequence 
number 23), 2 µL of the accompanying buffer, and sterilized 
water), and the inserted cDNA fragments were subjected to 40 
cycles of amplification using a "GeneAmp PCR System 9600" 
(Perkin Elmer Co.) at a denaturation [temperature] of 94 °C, an 
annealing [temperature] of 60 °C, and an extension [temperature] 
of 72 °C. Each PCR product was desalted with a Microcon-100 
(Millipore Co.) to remove unreacted primer and obtain template 
DNA. A portion (100-200 ng) of the template thus obtained was 
used to determine the base sequence by the method described in a 
product manual from ABI Co. 

(4) Database analysis of the clones obtained 

The base sequence of each of the 489 clones was searched for 
using the Basic BLAST 
(http://www.ncbi.nlm.nih.gov/cgi-bin/BLAST/nph-blast?Jform=0) of 
the National Center for Biotechnology Information (NCBI), a 
public database. As a result, 250 of the clones matched genes 
coding for 97 known proteins (Tables 1 and 2), and 220 of the 
clones corresponded to 172 genes that were either new sequences, 
which are candidates for genes coding for new nuclear transport 
proteins, or matches to known expressed sequence tags (ESTs). 
Another 19 of the clones were either derived from untranslated 
regions of known genes or had shifted codon reading frames. 

Table 1 shows those of the genes isolated by the method of 
the present invention that code for proteins that have been 
reported to have functions within the nucleus, and Table 2 shows 
those for which no function within the nucleus has been reported. 
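
The clone counts reported above can be checked for internal consistency. The following Python sketch is a bookkeeping illustration only: the dictionary labels and variable names are hypothetical, and the row counts for Tables 1 and 2 (45 and 52) are taken from the numbered rows of the tables as reproduced in this document.

```python
total_clones = 489

clone_classes = {
    "matched known proteins": 250,
    "new sequences or known ESTs": 220,
    "untranslated regions or shifted reading frames": 19,
}

# The three classes should account for every sequenced clone.
assert sum(clone_classes.values()) == total_clones

# Tables 1 and 2 together should list the 97 known proteins.
known_protein_genes = 97
table1_rows = 45
table2_rows = 52
assert table1_rows + table2_rows == known_protein_genes

for label, count in clone_classes.items():
    share = 100 * count / total_clones
    print(f"{label}: {count} clones ({share:.1f}%)")
```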



Table 1 

Columns: Gene; GenBank Accession; Function; Starting position of 
region where obtained*; Structural characteristics of region 
where obtained; Length (kb) of region where obtained; Medline UI 

Gene | Function 
1 | RNA binding protein 
2 | Synapse/nuclear protein 
3 | Glycolytic enzyme 
4 | Bacteria [illeg.]/Signal transmitting gene 
5 | Transcription factor 
6 | Calcium binding protein 
7 | Transcription factor 
8 | Cyclosporin binding protein 
9 | Steroid receptor coactivator 
10 | Transcription factor 
11 | Nuclear protein interacting with Gu and p53 
12 | Centromere region interaction protein 
13 | Transcription regulating factor 
14 | Transcription factor 
15 | DNA cleaving/modifying complex 
16 | Transcription regulating factor 
17 | Ribonucleoprotein 
18 | Ribonucleoprotein 
19 | Heterochromatin protein 
20 | Transcription regulating factor 
21 | NLS dependent nuclear transport receptor 
22 | NLS dependent nuclear transport receptor 
23 | Nuclear autoantigen 
24 | DNA binding protein 
25 | Metabolic enzyme 
26 | Assumed transcription regulating factor 
27 | M-phase phosphoprotein 
28 | Major nuclear matrix protein 
29 | Transcription factor 
30 | DNA binding protein 
31 | DNA binding protein 
32 | Contributes to inducing cell proliferation 
33 | Transcription regulating protein 
34 | Protein interacting with homeotic protein BMI1 
35 | Assumed transcription regulating factor 
36 | Protein interacting with kinesin-related proteins 
37 | U2 snRNP subunit protein complex 
38 | Transcription regulating factor 
39 | Assumed transcription regulating factor 
40 | Transcription factor 
41 | Transcription factor 
42 | Transcription factor 
43 | DNA cleaving/modifying enzyme 
44 | Nuclear membrane complex interacting protein 
45 | Assumed transcription regulating factor 

Table 2 

Columns: Gene; GenBank Accession; Function; Starting position of 
region where obtained*; Structural characteristics of region 
where obtained; Length (kb) of region where obtained; Medline UI 

Gene | Function 
1 | Contributes to purine synthesis pathway 
2 | Glycolytic enzyme 
3 | Actin binding protein 
4 | Pituitary protein 
5 | Mammalian MAPKKK homolog 
6 | Similar to ME491/CD63 superfamily 
7 | Contributes to intracellular protein transport 
8 | Assumed colorectal cancer suppressing gene product 
9 | Actin binding protein 
10 | Metabolic enzyme 
11 | Metabolic enzyme 
12 | Contributes to signal transmission 
13 | Endoplasmic reticulum calcium binding protein 
14 | Actin binding protein 
15 | Intermediate filament 
16 | Testes/brain specific GST 
17 | Assumed Golgi complex protein 
18 | Similar to yeast CDC10 
19 | Cyclin G interacting kinase 
20 | Homolog of Drosophila sina 
21 | Function unknown 
22 | Function unknown 
23 | Function unknown 
24 | Function unknown 
25 | Function unknown 
26 | Function unknown 
27 | Function unknown 
28 | Function unknown 
29 | Assumed kinesin receptor 
30 | Kinesin motor protein superfamily 
31 | G alpha 2 interacting protein 
32 | Metabolic enzyme 
33 | Kinesin motor protein superfamily 
34 | Contributes to [illeg.] transport 
35 | Similar to nel protein 
36 | Aglycon sugar protein family 
37 | Intermediate filament 
38 | Actin binding protein 
39 | Phosphoglycerate mutase family 
40 | Rac1 interacting protein 
41 | Effector protein of small GTPase Rab5 
42 | Small GTPase Rab5 interacting protein 
43 | Intermediate filament interacting protein 
44 | Ribosome protein 
45 | Assumed transcription factor 
46 | Contributes to [illeg.] transport 
47 | Similar to Grb-2 having an SH3 domain 
48 | Signal transmission adapter molecule 
49 | Assumed transcription control factor 
50 | Contributes to thyroid cancer 
51 | Cell adhesion factor 
52 | Intermediate filament 

flg*' 

9GB tDkctftf factor 

AO W»V*©id NACP Uynuciain) 

ildoteM A 

b«U eaunin 

c-*o* 

calmodulin 

CREB-2 

cyclopftibn A 

F-SRC-1 

GADD 133 (CHOP) 

Cu bind** preUm 

hCENP-8 



GsnBjnk 



ui* _ 

94263269 

88316381 

92162006 

97O4720B 

63221560 

96H4780 

92279218 

953WH6 

9*791002 

93015930 

97220420 

9t372020 

94266757 

9133427$ 

96292259 

95359*96 

67257972 

9 736 1 139 

9827294 I 

94766902 

96270562 

96770582 

86141726 

69 1 74 76 7 

67053963 

97136679 

97039667 

9 723877 t 

91173312 

96216178 

82391332 

94128073 

94340740 

97220024 

94020841 

98175913 

96154048 

9639741: 

96 1 679 J 7 

96219639 

9024972* 

90316407 

96112794 

97177122 

9674456$ 



13 hCREM-2 



h«at ahock factor I (TCF5) 
HHR23A proiain 



14 
IS 

16 HIRA 

17 hnRMPC 

18 hnRNPK 

19 HPlHa-fanwn» 

20 KSNFZb 

21 Wwporbn alpha 3 

22 karvop**«ri« alpha 3 

23 Ki nuckar autoantifaft 

24 Ku prcta* p70 

25 tactata aartydroflanaaa 

26 buoin* liptMr protwn (hOIP) 

27 M-ph*M phoaphoprotam (mpp6) 

28 matrin 3 

29 NF-fcappa-B p65 aubumt 

30 NP220 
31 
32 
33 
34 
35 
36 
37 
38 



nucUoeindin pr«curtor 
nwdaoaom* •s*«mbiy preiam (NAP) 
PC* 

poS*om«oOc t Kenwloc (HPH1) 
R8P2er*UnotoUttom» tending prwtain 
SMAP 

tptieaoma atiooiaiftd pretatn (SAP 145) 
SW1/SNF comoUM Mibvnit (BAF170) 

39 Ut int«r»euv« protam (T1P60) 

40 TEf =thyrovoa*» •mononie factor 
TFE3 

42 TFEB 

43 lODoiaomarai* lib 

44 TPR 

** TSC-22 
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U2253 UNA K3Y>/<9ft FOSPSRSRSR- 

L08850 v-j-?x/n KEOMPVOPO- 

XC5238 

219054 «B1t»/v?TJL£a#* 

V0I312 l£?B* 

045887 Ht),*,OUA*9W9% 

M86842 ls?B?' 

Y00052 ^-r^QXnfJ>ttft9>/^H 

US9302 ;L-rO«KL«t:?*-*«B* 

540708 lc?B? 

U78524 Cu X o53 fflSttffifc?>' <?K 

X55039 tt«*tt*ffi£t*ffl9>'<?K 

014826 U^tRItB? 

M64673 IsVB^ 

021235 OKAV)ff/MI»ttft* 

X77633 IsVnffB? 

Ml 634 2 'J tfJI 

S74678 'Jtf«*://t?H 

U26312 *^0?0-7*>9>/<9K 

D26I36 ls?UflIB?- 

U93240 NLSa»Mtt»f7U-t:^5- 

D896I6 NLSftfftttt&trU't?*- 

UU292 useo* 

M32B6S DNA«£?;,/(<?K 

V00711 ttWHK 

Z5078I (tJOIiXniB^ 

X98263 MWyA,BHt*://<?ir 

M63483 2E*aTH>vCX»>/<C7ll 

M62399 t£3?B?> 

083032 DNA«fc9>/<?* 

M96824 DNAM&^/C?* 

M86&67 aatatiattacMJ*- 
ut2979 u*n»B* * 

U89277 *>*T^??l//<?KBMI1ffl£ttffl9>'<^HGERDtGNPN-* 
S66431 ttSfl)U3?n«B^ LIEVSIDE70- 
US9919 ^*i/>«a»>/^KfflSffffl9>/<9K Gl KHt MURAL-*- 
U4137I U2 %**HP*t?3>-vh?>A1fLn£Vt ETRIWXKPG-* 
U666I6 KVffBB? RYDFQNPSRH-* 

U74667 nswu^tiiRB* 

U44059 HVB?- 
X5I330 lE^B? 
M33782 lc¥B? 
U5463I 0NA tatt/«*M* . 

U6B668 naaftffmat¥ffl9^/^K 



VELTSSUIIT- 

PTEAE100MI- 
GLVSPSHKSK- 
• 

AINQSKSED0-* 
• 

IK0MVMSLRV- 
EDEODDDDEE- 
• 

LEHVHGSGPt-* 
IPGSPEPEHG- 
GDFSTaFFNS- 
« 

TDPMFYOETY- 
KKKRDaAOXP-* 
VEEK11AAAK-* 
ICLSAVOAAR- 
SAQTOAWOS-*- 
a 

DSfENPVlOO- 
NKITWGVG3— 

KKllSEEWY- 
OCOSOENROO- 
ODRHRIEEKR-* 
IPTGOERTVD— 
QAQUEVWEE-* 
IPEFWUVFK- 



YMDIDEFUE- 
IQELE10A01- 
EITOAESAAI-* 
MDDOOOOHM— 
10HTIROSVG- 
MmRf£VEV~> 



NLS. bZlP 
NLS. bZIP 

NLS. ZIP 



NLS. bZIP 
ZIP 

NLS 

KNS 
NLS 
NLS 



NLS 
NLS 

ZIP 
NLS 
NLS 
NLS 
NLS 
ZIP. EF-hand 
NLS 
NLS 

NLS 
•rm 

NLS 
NLS. ZIP 

NLS 
NLS. bZIP 
NLS 
NLS. bMLHUP 
NLS 

ZIP 



1.8 
0.7 
1.8 
19 
2.0 
1.4 
1.8 
G.8 
2.5 
1.0 
2.« 
17 
2.0 
1.9 
1.8 
2.4 
1.9 
1.5 
1.8 
0.7 
2.4 
1.9 
2.4 
0.8 
2.0 
2.0 
1.7 
1.9 
2.0 
2.5 
1.9 
2.0 
2.0 
2.4 
1.9 
1.9 
1.7 
2.5 
1.9 
2.5 
2.0 
19 
\A 
2.8 
18 
No. Name                                    Accession  N-terminus (b) Motif                 kb    Remarks
1   ADE2H1                                  X53793     *                                    2.0
2   aldolase C                              X07292     YPALSAEQKK-*                         1.8
3   alpha-actinin                           XI 5804    EQVEKGYEEW-*   coiled-coil. EF-hand  2.5
4   antisecretory factor-1                  U24704     *              NLS                   1.8
5   ASK1                                    08*476     IRTLFLGIPD—                          2.0   MAPKKK [illegible]
6   cell surface glycoprotein               010653     s                                    2.0   ME491/CD63 [illegible]
7   coatomer protein (COPA)                 U24105     GHYONAIYIG—    WD-40                 2J
8   colorectal mutant cancer protein        M623S7     EISS1GVSSS—    NLS                   4.2   [illegible]
9   cytoskeletal tropomyosin TM30           X04588     *              coiled-coil           2.0
10  cytosolic malate dehydrogenase          D55654     t                                    1.8
11  dihydrolipoamide dehydrogenase          JQ3620     IPVNTRFQTK-                          1.9
12  epsilon 14-3-3 protein                  U2B936     OHVETELKLI-*                         1.8
13  ERC-55                                  X78669     LKOKKRf £KA-»  [illegible]           1.9   [illegible]
14  fibroblast tropomyosin TM30             X0S276     YEEEIKUSD—     coiled-coil           1.8
15  glial fibrillary acidic protein (GFAP)  J04S69     OrEAMASSNrW    coiled-coil           2.4
16  glutathione S-transferase M3 (GSTM3)    J054S9     ESSMVLGYWD-*                         0.7
17  golgin-95                               LOS 147    QrVAAYOOU-*    coiled-coil           2.5
18  hCDC10 (CDC10 homolog)                  S 7 2008   REHVAKHKKM-*   NLS. ZIP              1.8   [illegible]
19  HsGAK                                   088435     OGPPEDUSE-                           2.0   [illegible]
20  hSIAH2                                  U76248     EHEOICEYRP-                          1.9   [illegible]
21  KIAA0116                                029958     VTlSEAEKVr-*                         0.9   [illegible]
22  KIAA0136                                050926     QLLLVTEEKE-                          2.5   [illegible]
23  KIAA0171                                079993     OATMTSSQSH-*                         2.2   [illegible]
24  KIAA0181                                080003     OHLVStCETST-   NLS                   0.8
25  KIAA0332                                A80C2330   IPIDATPIDD—    NLS                   1.7
26  KIAA0365                                AB002363   SGCPLOVKKA—    NLS                   2.0
27  KIAA0373                                AS002371   IISATSOKEA-    NLS                   0.6   [illegible]
28  KIAA0432                                A8007892   SAPIIMFSAC—                          2.3
29  kinectin                                L25616     OKIOAUNEO-*-   coiled-coil           2.8
30  kinesin-2 (HK2)                         Y083I9     RVKEITVOPT—    coiled-coil           2.4
31  LGN protein                             U54999     IPHSCRK1SA-*                         1.9   G alpha i2 [illegible]
32  malate dehydrogenase                    U203S2     a                                    1.8   [illegible]
33  mitotic kinesin-like protein 1          X67155                    NLS. coiled-coil      2.6
34  N-ethylmaleimide-sensitive factor       U03985     IASIENDIKP—                          2.5
35  nel-related protein (NRP1)              083017     RNOKHGLFKG-*   EGF-like              2.3
36  neurocan (CSPG3)                        AF 026547  PAOWKAEHS-*    NLS                   2.0
37  neurofilament-66                        S7S298     LAFVROVHDE—    coiled-coil           2.5
38  non-muscle myosin heavy chain-B         M&9181     KJtHtSlEAEI-*  coiled-coil           2.5
39  phosphoglycerate mutase (PGAM-B)        J04173     s                                    2.4
40  por1                                    X97567     FGacsmvo-      coiled-coil           1.8   Rac1 [illegible]
41  Rabaptin-5                              X91I41     IGIQEAiTRD—    coiled-coil           2.0   small GTPase Rab5 [illegible]
42  Rap2 interacting protein 8 (RPIP8)      U93871     KFRIVYAQXG-*-                        1.8   small GTPase Rap2 [illegible]
43  restin                                  X64838     KFIKOADEEk^    NLS. coiled-coil      2.3
44  RIG-like 7-1                            AF034208   DKQRDCOPGI-*                         1.8   [illegible]
45  RING zinc finger protein (RZF)          AF037204   KTWCTCPVCK-*                         1.9
46  secretogranin I (chromogranin B)        Y00064     PEYGEEIKCY—    NLS                   1.7
47  SH3GL2                                  X996S7     LH0KCLREIO-*   NLS. SH3              2.1   SH3 [illegible] Grb-2 [illegible]
48  STAM                                    U438S9     OPNWKGETH—     ITAM                  2.4
49  tax1-binding protein TXBP151            U338Z1     SKEDTCFUE-*                          1.9
50  TFG protein                             Y07968     LRREUELRH-     coiled-coil           1.9   [illegible]
51  trophinin                               U04811     PSKSIGPGAA—                          1.9
52  vimentin                                Z19554     E10A0I0EOH-    coiled-coil           ».7

The symbols in the tables have the following meanings: 

a: Indicates that the protein obtained represents the 
shortest inserted fragment in the group to which it is assigned. 

b: Indicates the 10 amino acid residues from the amino terminus 
of the protein coded for by the inserted gene fragment. 

c: Medline Unique Identifier of the document reporting a 
function in the nucleus. 

*: A clone comprising the entire translation region. 

S/R rich: A serine/arginine-rich region. 

NLS: Assumed nuclear transport signal rich in basic 
residues. 

ZIP: Leucine zipper 

bZIP: Basic leucine zipper 

KNS: hnRNP K nuclear transport signal 

arm: Armadillo repeat 

bHLHZIP: Basic helix-loop-helix leucine zipper 

SH3: Src homology domain 3 

ITAM: Immunoreceptor tyrosine-based activation motif 

As shown in Tables 1 and 2, about half of the 97 known 
proteins were proteins reported to have functions within the 
nucleus. The proportion of transcription control factors and DNA/RNA 
splicing proteins was particularly high. Accordingly, for the new 
genes as well, it may be readily anticipated that genes 
coding for unknown proteins functioning within the nucleus will 
be obtained efficiently and specifically. Further, in the 
hnRNP K protein among the isolated clones, the KNS sequence 
(Matthew, W., EMBO J., 1997, 16: 3587-3598) responsible for 
shuttling into and out of the nucleus was found. 
The finding of the M9 sequence and the KNS sequence, both of which are 
responsible for movement into and out of the nucleus, among the 
clones isolated by the method of the present invention 
demonstrates that the method is not only 
capable of selecting nuclear transport proteins specifically and 
with high efficiency, but can also be extended to the 
general selection of proteins that move into and out of the nucleus 
(outside the nucleus -> inside the nucleus, inside the nucleus -> 
outside the nucleus). 

[Embodiment 6] Demonstration of the efficacy of the nuclear 
transport protein trap vector "pNS" based on the fusion of cDNA 
coding for known nuclear transport proteins 

(1) Construction of fused plasmids of known cDNA fragments 
cDNA in the form of "'BraC" (TANAKA, Mahito, New 
Biochemistry Experiment Lecture 6 (Ed. by the Japan Biochemistry 
Society), Biomembranes and Membrane Transport (2/2), 1992, Tokyo 
Chemistry Club, 9 15) and calcium/calmodulin dependent protein 
kinase kinase "CaMKK" (Tokumitsu, H., J. Biol. Chem., 1995, 
270(33): 19320-19324; Tokumitsu, Hiroshi, "Localization of CaMKK 
in Cells", unpublished data) were employed as representative 
proteins localized in cytoplasm. cDNA in the form of SV40 "NLS", 
"NLS-'BraC" obtained by artificially fusing SV40 "NLS" and 
"'BraC", the transcription factor NF-kappa-B 
p65 subunit "NFKBp65" (Ganchi, P. A., Mol. Biol. Cell, 1992, 
3(12): 1339-1352), and the transcription factor "c-Fos" 
(Tratner, I., Oncogene, 1991, 6(11): 2049-2053) were employed as 
representative proteins localized in the nucleus and having 
conventional nuclear transport signals. The plasmid "pRSIF" was 
employed for "LexAD", "pNS" for "NES-LexAD", "pRS3FN" for 
"NES-LexAD-NLS", "pRS3F'BraC" for "NES-LexAD-'BraC", and 
"pRS3FN'BraC" for "NES-LexAD-NLS-'BraC", respectively. 
"NES-LexAD-NFKBp65" was prepared by amplifying "NFKBp65" by PCR 
employing the primers "NU32" (sequence number 24) and "NU24" 
(sequence number 25) and employing "pME18S(N)-p65" (Tsuboi, A., 
Biochem. Biophys. Res. Commun., 1994, 199(2): 1064-1072) as 
template, digesting the fragments with the restriction enzymes 
MunI and NotI, and inserting the fragments into the EcoRI/NotI 
site of "pNS". Similarly, "NES-LexAD-cFos" was prepared by 
amplifying "c-Fos" by PCR employing the primers "NU34" (sequence 
number 26) and "NU24" and employing "pME18S(N)-cFos" (Tsuboi, A., 
Biochem. Biophys. Res. Commun., 1994, 199(2): 1064-1072) as 
template, followed by insertion into the EcoRI/NotI site of 
"pNS". "NES-LexAD-CaMKK" was prepared by digesting "pET-CaMKK" 
(provided by Mr. Hiroshi TOKUMITSU) with the restriction enzyme 
NcoI to obtain a "CaMKK" cDNA fragment, which was then inserted 
at the NcoI site of "pNS". 
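
The cloning step above relies on the NU32-amplified fragment being cut with MunI yet ligated into an EcoRI-cut vector. The following minimal sketch illustrates why that can work, by locating the sites in the primers of the sequence table and comparing the overhangs. The recognition sequences (EcoRI G^AATTC, MunI C^AATTG, NotI GC^GGCCGC, NcoI C^CATGG) are standard enzyme facts and are not stated in this specification; the helper names are hypothetical.

```python
# Recognition sequences (5'->3'); these are general enzyme facts, not taken
# from the specification.
SITES = {"EcoRI": "GAATTC", "MunI": "CAATTG", "NotI": "GCGGCCGC", "NcoI": "CCATGG"}

# Primer sequences copied from sequence numbers 24, 25 and 26.
PRIMERS = {
    "NU32": "TTTCAATTGGAATGGACGAACTGTTCCCCCTC",
    "NU24": "GCGCAGCGAGTCAGTGAGCGAGGAAGCGGAAGAGG",
    "NU34": "TTTGAATTCTAATGATGTTCTCGGGTTTCAACGCG",
}


def find_sites(seq):
    """Return {enzyme: [0-based start positions]} for each site found in seq."""
    hits = {}
    for enzyme, site in SITES.items():
        positions = [i for i in range(len(seq) - len(site) + 1)
                     if seq[i:i + len(site)] == site]
        if positions:
            hits[enzyme] = positions
    return hits


def five_prime_overhang(site):
    """Overhang of a 6-bp site that is cut after its first base (EcoRI, MunI)."""
    return site[1:5]


for name, seq in PRIMERS.items():
    print(name, find_sites(seq))

# Both enzymes leave the same 4-nt overhang, so the ends can be ligated.
print(five_prime_overhang(SITES["MunI"]) == five_prime_overhang(SITES["EcoRI"]))
```

Under these assumptions NU32 carries a MunI site and NU34 an EcoRI site, while NU24 carries none of the four sites, which suggests that the NotI site used for cloning lies in the amplified template rather than in the primer.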

(2) These plasmids were each introduced into the EGY48 strain, 
and expression of the reporter gene LEU2 was observed. Following 
transformation with the various plasmids described above in (1), 
the cells were plated directly on SD medium (-HIS, -LEU). "LexAD", 
which has no NES, formed colonies, which is thought to be due to 
passive diffusion into the nucleus. Colony formation with 
"NES-LexAD", into which NES had been introduced, was completely 
inhibited. However, with "NES-LexAD-NLS", into which NLS had been 
additionally incorporated, colony formation was again observed. 
Similarly, with "NES-LexAD-NLS-'BraC", "NES-LexAD-NFKBp65", and 
"NES-LexAD-cFos", all of which comprise a conventional NLS, colony 
formation was observed. With "NES-LexAD-'BraC" and 
"NES-LexAD-CaMKK", which do not have nuclear transportability, 
colony formation was completely inhibited. These results 
demonstrate that the specific detection of cDNA fragments having 
nuclear transportability is possible in systems employing the 
"pNS" vector. 
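
The selection logic in (2) can be summarized as a simple rule: colonies form when the fusion has no NES, or when the insert supplies a nuclear transport signal that overrides the NES. The sketch below encodes the eight constructs and compares this rule with the outcomes reported above. The class and function names are hypothetical, and the two-flag model is a deliberate simplification of the biology.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Construct:
    name: str
    has_nes: bool  # nuclear export signal fused to LexAD
    has_nls: bool  # insert carries a functional nuclear transport signal


def predicts_colonies(c):
    """LEU2 reporter is expected to be on if the fusion can accumulate in the nucleus."""
    return (not c.has_nes) or c.has_nls


# (construct, colony formation reported in the text)
RESULTS = [
    (Construct("LexAD", False, False), True),
    (Construct("NES-LexAD", True, False), False),
    (Construct("NES-LexAD-NLS", True, True), True),
    (Construct("NES-LexAD-NLS-'BraC", True, True), True),
    (Construct("NES-LexAD-NFKBp65", True, True), True),
    (Construct("NES-LexAD-cFos", True, True), True),
    (Construct("NES-LexAD-'BraC", True, False), False),
    (Construct("NES-LexAD-CaMKK", True, False), False),
]

for construct, observed in RESULTS:
    print(f"{construct.name:22s} predicted={predicts_colonies(construct)} observed={observed}")
```

In this simplified model the rule reproduces all eight reported outcomes, which is the sense in which the "pNS" system acts as a trap for nuclear transport signals.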


Potential For Industrial Use 

Based on the present invention, it is possible to 
conveniently detect whether or not a peptide coded for by test 
DNA has nuclear transportability by employing the expression of 
a reporter gene as an indicator. Further, it is possible to 
rapidly, efficiently, and comprehensively clone DNA coding for a 
protein having nuclear transportability by employing the 
expression of a reporter gene as an indicator. Based on the present 
invention, not only is the obtaining of DNA coding for new 
intranuclear proteins of biological importance advanced, but 
extremely useful gene expression information (time, place, 
expression frequency, and the like) for research into 
the functioning of proteins in the nucleus can also be provided. 
Further, the use of this information is expected to contribute to 
the development of epoch-making drugs. 

Sequence Table 

(1) Name or designation of applicant: Helix Research Institute 

(2) Title of Invention: METHODS FOR DETECTING AND ISOLATING 
NUCLEAR TRANSPORT PROTEINS 

(3) Filing Number: H1-804DP1PCT 

(4) Application Number: 

(5) Filing Date: 

(6) Name of Country and Numbers of Applications Relied on for 
Priority: 

Japan Patent Application No. Hei 9-124795 

Japan Patent Application No. Hei 9-309686 

(7) Priority Date: April 28, 1997 

October 24, 1997 

(8) Number of Sequences: 26 

Sequence number: 1 

Length of sequence: 30 

Form of sequence: Nucleic acid 

Number of strands: One 

Topology: Straight chain 

Type of Sequence: Another nucleic acid Synthetic DNA 
Sequence : 

TTTGAATTCG CCAATTTTAA TCAAAGTGGG 30 

Sequence number: 2 

Length of sequence: 32 

Form of sequence: Nucleic acid 

Number of strands: One 
Topology: Straight chain 

Type of Sequence: Another nucleic acid Synthetic DNA 
Sequence : 

TAGCATCTAT GACTTTTTGG GGCGTTCAAG 

Sequence number: 3 

Length of sequence: 342 

Form of sequence: Nucleic acid 

Number of strands: One 

Topology: Straight chain 

Type of Sequence: cDNA to mRNA 

Sequence characteristics: 

Code denoting characteristics: domain 

Position where present: 1..342 

Method of determining characteristic: S 

Sequence : 



GCC AAT TTT AAT CAA AGT GGG AAT ATT GCT GAT AGC TCA TTG TCC TTC 48 
Ala Asn Phe Asn Gln Ser Gly Asn Ile Ala Asp Ser Ser Leu Ser Phe 
1 5 10 15 

ACT TTC ACT AAC AGT AGC AAC GGT CCG AAC CTC ATA ACA ACT CAA ACA 96 
Thr Phe Thr Asn Ser Ser Asn Gly Pro Asn Leu Ile Thr Thr Gln Thr 
20 25 30 

AAT TCT CAA GCG CTT TCA CAA CCA ATT GCC TCC TCT AAC GTT CAT GAT 144 
Asn Ser Gln Ala Leu Ser Gln Pro Ile Ala Ser Ser Asn Val His Asp 
35 40 45 

AAC TTC ATG AAT AAT GAA ATC ACG GCT AGT AAA ATT GAT GAT GGT AAT 192 
Asn Phe Met Asn Asn Glu Ile Thr Ala Ser Lys Ile Asp Asp Gly Asn 
50 55 60 

AAT TCA AAA CCA CTG TCA CCT GGT TGG ACG GAC CAA ACT GCG TAT AAC 240 
Asn Ser Lys Pro Leu Ser Pro Gly Trp Thr Asp Gln Thr Ala Tyr Asn 
65 70 75 80 

GCG TTT GGA ATC ACT ACA GGG ATG TTT AAT ACC ACT ACA ATG GAT GAT 288 
Ala Phe Gly Ile Thr Thr Gly Met Phe Asn Thr Thr Thr Met Asp Asp 
85 90 95 

GTA TAT AAC TAT CTA TTC GAT GAT GAA GAT ACC CCA CCA AAC CCA AAA 336 
Val Tyr Asn Tyr Leu Phe Asp Asp Glu Asp Thr Pro Pro Asn Pro Lys 
100 105 110 

AAA GAG 342 
Lys Glu 

Sequence number: 4 

Length of sequence: 609 

Form of sequence: Nucleic acid 

Number of strands: Two 

Topology: Straight chain 

Type of Sequence: cDNA to mRNA 

Sequence characteristics: 

Code denoting characteristics: CDS 

Position where present: 1..606 

Method of determining characteristic: S 
Sequence : 

ATG AAA GCG TTA ACG GCC AGG CAA CAA GAG GTG TTT GAT CTC ATC CGT 48 
Met Lys Ala Leu Thr Ala Arg Gln Gln Glu Val Phe Asp Leu Ile Arg 
1 5 10 15 

GAT CAC ATC AGC CAG ACA GGT ATG CCG CCG ACG CGT GCG GAA ATC GCG 96 
Asp His Ile Ser Gln Thr Gly Met Pro Pro Thr Arg Ala Glu Ile Ala 
20 25 30 

CAG CGT TTG GGG TTC CGT TCC CCA AAC GCG GCT GAA GAA CAT CTG AAG 144 
Gln Arg Leu Gly Phe Arg Ser Pro Asn Ala Ala Glu Glu His Leu Lys 
35 40 45 

GCG CTG GCA CGC AAA GGC GTT ATT GAA ATT GTT TCC GGC GCA TCA CGC 192 
Ala Leu Ala Arg Lys Gly Val Ile Glu Ile Val Ser Gly Ala Ser Arg 
50 55 60 

GGG ATT CGT CTG TTG CAG GAA GAG GAA GAA GGG TTG CCG CTG GTA GGT 240 
Gly Ile Arg Leu Leu Gln Glu Glu Glu Glu Gly Leu Pro Leu Val Gly 
65 70 75 80 

CGT GTG GCT GCC GGT GAA CCA CTT CTG GCG CAA CAG CAT ATT GAA GGT 288 
Arg Val Ala Ala Gly Glu Pro Leu Leu Ala Gln Gln His Ile Glu Gly 
85 90 95 

CAT TAT CAG GTC GAT CCT TCC TTA TTC AAG CCG AAT GCT GAT TTC CTG 336 
His Tyr Gln Val Asp Pro Ser Leu Phe Lys Pro Asn Ala Asp Phe Leu 
100 105 110 

CTG CGC GTC AGC GGG ATG TCG ATG AAA GAT ATC GGC ATT ATG GAT GGT 384 
Leu Arg Val Ser Gly Met Ser Met Lys Asp Ile Gly Ile Met Asp Gly 
115 120 125 

GAC TTG CTG GCA GTG CAT AAA ACT CAG GAT GTA CGT AAC GGT CAG GTC 432 
Asp Leu Leu Ala Val His Lys Thr Gln Asp Val Arg Asn Gly Gln Val 
130 135 140 

GTT GTC GCA CGT ATT GAT GAC GAA GTT ACC GTT AAG CGC CTG AAA AAA 480 
Val Val Ala Arg Ile Asp Asp Glu Val Thr Val Lys Arg Leu Lys Lys 
145 150 155 160 

CAG GGC AAT AAA GTC GAA CTG TTG CCA GAA AAT AGC GAG TTT AAA CCA 528 
Gln Gly Asn Lys Val Glu Leu Leu Pro Glu Asn Ser Glu Phe Lys Pro 
165 170 175 

ATT GTC GTT GAC CTT CGT CAG CAG AGC TTC ACC ATT GAA GGG CTG GCG 576 
Ile Val Val Asp Leu Arg Gln Gln Ser Phe Thr Ile Glu Gly Leu Ala 
180 185 190 

GTT GGG GTT ATT CGC AAC GGC GAC TGG CTG TAA 609 
Val Gly Val Ile Arg Asn Gly Asp Trp Leu 
195 200 



Sequence number: 5 

Length of sequence: 10 

Form of sequence: Amino acid 

Topology: Straight chain 

Type of Sequence: Peptide 

Sequence 

Gln Leu Pro Pro Leu Glu Arg Leu Thr Leu 
1 5 10 

Sequence number: 6 

Length of sequence: 30 

Form of sequence: Nucleic acid 

Number of strands: One 

Topology: Straight chain 

Type of Sequence: synthetic DNA 

Sequence 

ACAGCTGCCA CCGATTGAGA GACTTACGTT 30 

Sequence number: 7 

Length of sequence: 30 

Form of sequence: Nucleic acid 

Number of strands: One 

Topology: Straight chain 

Type of Sequence: Other nucleic acid synthetic DNA 
Sequence 

TGTCGACGGT GGCTAACTCT CTGAATGCAA 



Sequence number: 8 

Form of sequence: Nucleic acid 

Number of strands: Two 

Topology: Straight chain 

Type of Sequence: cDNA to mRNA 

Sequence characteristics : 

Code denoting characteristics: CDS 
Position where present: 1..1077 
Method of determining characteristic: E 

Sequence : 



ATG AAA GCG TTA CAG CTG CCA CCG ATT GAG AGA CTT ACG TTA ACG GCC 48 
Met Lys Ala Leu Gln Leu Pro Pro Ile Glu Arg Leu Thr Leu Thr Ala 
1 5 10 15 

AGG CAA CAA GAG GTG TTT GAT CTC ATC CGT GAT CAC ATC AGC CAG ACA 96 
Arg Gln Gln Glu Val Phe Asp Leu Ile Arg Asp His Ile Ser Gln Thr 
20 25 30 

GGT ATG CCG CCG ACG CGT GCG GAA ATC GCG CAG CGT TTG GGG TTC CGT 144 
Gly Met Pro Pro Thr Arg Ala Glu Ile Ala Gln Arg Leu Gly Phe Arg 
35 40 45 

TCC CCA AAC GCG GCT GAA GAA CAT CTG AAG GCG CTG GCA CGC AAA GGC 192 
Ser Pro Asn Ala Ala Glu Glu His Leu Lys Ala Leu Ala Arg Lys Gly 
50 55 60 

GTT ATT GAA ATT GTT TCC GGC GCA TCA CGC GGG ATT CGT CTG TTG CAG 240 
Val Ile Glu Ile Val Ser Gly Ala Ser Arg Gly Ile Arg Leu Leu Gln 
65 70 75 80 

GAA GAG GAA GAA GGG TTG CCG CTG GTA GGT CGT GTG GCT GCC GGT GAA 288 
Glu Glu Glu Glu Gly Leu Pro Leu Val Gly Arg Val Ala Ala Gly Glu 
85 90 95 

CCA CTT CTG GCG CAA CAG CAT ATT GAA GGT CAT TAT CAG GTC GAT CCT 336 
Pro Leu Leu Ala Gln Gln His Ile Glu Gly His Tyr Gln Val Asp Pro 
100 105 110 

TCC TTA TTC AAG CCG AAT GCT GAT TTC CTG CTG CGC GTC AGC GGG ATG 384 
Ser Leu Phe Lys Pro Asn Ala Asp Phe Leu Leu Arg Val Ser Gly Met 
115 120 125 

TCG ATG AAA GAT ATC GGC ATT ATG GAT GGT GAC TTG CTG GCA GTG CAT 432 
Ser Met Lys Asp Ile Gly Ile Met Asp Gly Asp Leu Leu Ala Val His 
130 135 140 

AAA ACT CAG GAT GTA CGT AAC GGT CAG GTC GTT GTC GCA CGT ATT GAT 480 
Lys Thr Gln Asp Val Arg Asn Gly Gln Val Val Val Ala Arg Ile Asp 
145 150 155 160 

GAC GAA GTT ACC GTT AAG CGC CTG AAA AAA CAG GGC AAT AAA GTC GAA 528 
Asp Glu Val Thr Val Lys Arg Leu Lys Lys Gln Gly Asn Lys Val Glu 
165 170 175 

CTG TTG CCA GAA AAT AGC GAG TTT AAA CCA ATT GTC GTT GAC CTT CGT 576 
Leu Leu Pro Glu Asn Ser Glu Phe Lys Pro Ile Val Val Asp Leu Arg 
180 185 190 

CAG CAG AGC TTC ACC ATT GAA GGG CTG GCG GTT GGG GTT ATT CGC AAC 624 
Gln Gln Ser Phe Thr Ile Glu Gly Leu Ala Val Gly Val Ile Arg Asn 
195 200 205 

GGC GAC TGG CTG GAA TTC GCC AAT TTT AAT CAA AGT GGG AAT ATT GCT 672 
Gly Asp Trp Leu Glu Phe Ala Asn Phe Asn Gln Ser Gly Asn Ile Ala 
210 215 220 

GAT AGC TCA TTG TCC TTC ACT TTC ACT AAC AGT AGC AAC GGT CCG AAC 720 
Asp Ser Ser Leu Ser Phe Thr Phe Thr Asn Ser Ser Asn Gly Pro Asn 
225 230 235 240 

CTC ATA ACA ACT CAA ACA AAT TCT CAA GCG CTT TCA CAA CCA ATT GCC 768 
Leu Ile Thr Thr Gln Thr Asn Ser Gln Ala Leu Ser Gln Pro Ile Ala 
245 250 255 

TCC TCT AAC GTT CAT GAT AAC TTC ATG AAT AAT GAA ATC ACG GCT AGT 816 
Ser Ser Asn Val His Asp Asn Phe Met Asn Asn Glu Ile Thr Ala Ser 
260 265 270 

AAA ATT GAT GAT GGT AAT AAT TCA AAA CCA CTG TCA CCT GGT TGG ACG 864 
Lys Ile Asp Asp Gly Asn Asn Ser Lys Pro Leu Ser Pro Gly Trp Thr 
275 280 285 

GAC CAA ACT GCG TAT AAC GCG TTT GGA ATC ACT ACA GGG ATG TTT AAT 912 
Asp Gln Thr Ala Tyr Asn Ala Phe Gly Ile Thr Thr Gly Met Phe Asn 
290 295 300 

ACC ACT ACA ATG GAT GAT GTA TAT AAC TAT CTA TTC GAT GAT GAA GAT 960 
Thr Thr Thr Met Asp Asp Val Tyr Asn Tyr Leu Phe Asp Asp Glu Asp 
305 310 315 320 

ACC CCA CCA AAC CCA AAA AAA GAG ATC TCT ATG GCT TAC CCA TAC GAT 1008 
Thr Pro Pro Asn Pro Lys Lys Glu Ile Ser Met Ala Tyr Pro Tyr Asp 
325 330 335 

GTT CCA GAT TAC GCT AGC TTG GGT GGT CAT ATG GCC ATG GCG GCC GCT 1056 
Val Pro Asp Tyr Ala Ser Leu Gly Gly His Met Ala Met Ala Ala Ala 
340 345 350 

CGA GTC GAC CTG CAG CCA AGC TAA 1080 
Arg Val Asp Leu Gln Pro Ser 
355 



Sequence number: 9 

Form of sequence: Nucleic acid 

Number of strands: Two 

Topology: Straight chain 

Type of Sequence: cDNA to mRNA 

Sequence characteristics: 

Code denoting characteristics: CDS 

Position where present: 1..1149 


Method of determining characteristic: S 
Sequence : 

ATG GCT AAG ATC TCT CCC GGG CTC GAG CTC ATG AAG AAG GGT ACT CAG 48 
Met Ala Lys Ile Ser Pro Gly Leu Glu Leu Met Lys Lys Gly Thr Gln 
1 5 10 15 

CGT CTA TCC CGC CTG TTC GCC GCG ATG GCC ATT GCC GGG TTC GCC AGC 96 
Arg Leu Ser Arg Leu Phe Ala Ala Met Ala Ile Ala Gly Phe Ala Ser 
20 25 30 

TAC TCC ATG GCC GCC GAC ACC ATC AAG ATC GCC CTG GCT GGC CCG GTC 144 
Tyr Ser Met Ala Ala Asp Thr Ile Lys Ile Ala Leu Ala Gly Pro Val 
35 40 45 

ACC GGT CCG GTA GCC CAG TAC GGC GAC ATG CAG CGC GCC GGT GCG CTG 192 
Thr Gly Pro Val Ala Gln Tyr Gly Asp Met Gln Arg Ala Gly Ala Leu 
50 55 60 

ATG GCA ATC GAA CAG ATC AAC AAG GCA GGC GGC GTG AAC GGC GCG CAA 240 
Met Ala Ile Glu Gln Ile Asn Lys Ala Gly Gly Val Asn Gly Ala Gln 
65 70 75 80 

CTC GAA GGC GTG ATC TAC GAC GAC GCC TGC GAT CCC AAG CAG GCC GTG 288 
Leu Glu Gly Val Ile Tyr Asp Asp Ala Cys Asp Pro Lys Gln Ala Val 
85 90 95 

GCG GTC GCC AAC AAG GTG GTC AAC GAC GGC GTC AAG TTC GTG GTC GGT 336 
Ala Val Ala Asn Lys Val Val Asn Asp Gly Val Lys Phe Val Val Gly 
100 105 110 

CAT GTC TGC TCC AGC TCC ACC CAA CCC GCC ACC GAC ATC TAC GAA GAC 384 
His Val Cys Ser Ser Ser Thr Gln Pro Ala Thr Asp Ile Tyr Glu Asp 
115 120 125 

GAA GGC GTG CTG ATG ATC ACC CCG TCG GCC ACC GCC CCG GAA ATC ACC 432 
Glu Gly Val Leu Met Ile Thr Pro Ser Ala Thr Ala Pro Glu Ile Thr 
130 135 140 

TCG CGC GGC TAC AAG CTG ATC TTC CGC ACC ATC GGC CTG GAC AAC ATG 480 
Ser Arg Gly Tyr Lys Leu Ile Phe Arg Thr Ile Gly Leu Asp Asn Met 
145 150 155 160 

CAG GGC CCG GTG GCC GGC AAG TTC ATC GCC GAA CGC TAC AAG GAC AAG 528 
Gln Gly Pro Val Ala Gly Lys Phe Ile Ala Glu Arg Tyr Lys Asp Lys 
165 170 175 

ACC ATC GCG GTA CTG CAC GAC AAG CAG CAG TAC GGC GAA GGC ATC GCC 576 
Thr Ile Ala Val Leu His Asp Lys Gln Gln Tyr Gly Glu Gly Ile Ala 
180 185 190 

ACC GAG GTG AAG AAG ACC GTG GAA GAC GCC GGC ATC AAG GTT GCC GTC 624 
Thr Glu Val Lys Lys Thr Val Glu Asp Ala Gly Ile Lys Val Ala Val 
195 200 205 

TTC GAA GGC CTG AAC GCC GGC GAC AAG GAC TTC AAC GCG CTG ATC AGC 672 
Phe Glu Gly Leu Asn Ala Gly Asp Lys Asp Phe Asn Ala Leu Ile Ser 
210 215 220 

AAG CTG AAG AAA GCC GGC GTG CAG TTC GTC TAC TTC GGC GGC TAC CAC 720 
Lys Leu Lys Lys Ala Gly Val Gln Phe Val Tyr Phe Gly Gly Tyr His 
225 230 235 240 

CCA GAA ATG GGC CTG CTG CTG CGC CAG GCC AAG CAG GCC GGG CTG GAC 768 
Pro Glu Met Gly Leu Leu Leu Arg Gln Ala Lys Gln Ala Gly Leu Asp 
245 250 255 

GCG CGC TTC ATG GGC CCG GAA GGG GTC GGC AAC AGC GAA ATC ACC GCG 816 
Ala Arg Phe Met Gly Pro Glu Gly Val Gly Asn Ser Glu Ile Thr Ala 
260 265 270 

ATC GCC GGC GAC GCT TCG GAA GGC ATG CTG GCG ACC CTG CCG CGC GCC 864 
Ile Ala Gly Asp Ala Ser Glu Gly Met Leu Ala Thr Leu Pro Arg Ala 
275 280 285 

TTC GAG CAG GAT CCG AAG AAC AAG GCC CTG ATC GAC GCC TTC AAG GCG 912 
Phe Glu Gln Asp Pro Lys Asn Lys Ala Leu Ile Asp Ala Phe Lys Ala 
290 295 300 

AAG AAC CAG GAT CCG AGC GGC ATC TTC GTC CTG CCC GCC TAC TCC GCG 960 
Lys Asn Gln Asp Pro Ser Gly Ile Phe Val Leu Pro Ala Tyr Ser Ala 
305 310 315 320 

GTC ACA GTG ATC GCC AAG GGC ATC GAG AAA GCC GGC GAG GCC GAT CCG 1008 
Val Thr Val Ile Ala Lys Gly Ile Glu Lys Ala Gly Glu Ala Asp Pro 
325 330 335 

GAG AAG GTC GCC GAG GCC CTG CGC GCC AAC ACC TTC GAG ACT CCC ACC 1056 
Glu Lys Val Ala Glu Ala Leu Arg Ala Asn Thr Phe Glu Thr Pro Thr 
340 345 350 

GGG AAC CTC GGG TTC GAC GAG AAG GGC GAC CTG AAG AAC TTC GAC TTC 1104 
Gly Asn Leu Gly Phe Asp Glu Lys Gly Asp Leu Lys Asn Phe Asp Phe 
355 360 365 

ACC GTC TAC GAG TGG CAC AAG GAC GCC ACC CGG ACC GAG GTC AAG 1149 
Thr Val Tyr Glu Trp His Lys Asp Ala Thr Arg Thr Glu Val Lys 
370 375 380 

TAA 1152 



Sequence number: 10 
Length of sequence: 12 

Form of sequence: Amino acid 
Topology: Straight chain 
Type of Sequence: Peptide 
Sequence 

Ser Glu Pro Pro Lys Lys Lys Arg Lys Val Glu Thr 
1 5 10 



Sequence number: 11 

Length of sequence: 37 

Form of sequence: Nucleic acid 

Number of strands: One 

Topology: Straight chain 

Type of Sequence: Other nucleic acid synthetic DNA 
Sequence 

CTAGCGAGCC TCCAAAAAAG AAGAGAAAGG TCGAAAC 37 



Sequence number: 12 

Length of sequence: 37 

Form of sequence: Nucleic acid 

Number of strands: One 

Topology: Straight chain 

Type of Sequence: Other nucleic acid synthetic DNA 
Sequence 

GCTCGGAGGT TTTTTCTTCT CTTTCCAGCT TTGGTAC 
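
As an illustration of how sequence numbers 10 to 12 appear to relate, the sketch below translates the oligonucleotide of sequence number 11 and compares it with the strand of sequence number 12. It rests on two assumptions that the listing does not state: that the reading frame starts at the third base of sequence number 11, and that sequence number 12 is printed 3' to 5' so that it pairs with sequence number 11 position by position. The codon table is the standard genetic code; the function names are hypothetical.

```python
# Standard genetic code, bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}

SEQ11 = "CTAGCGAGCCTCCAAAAAAGAAGAGAAAGGTCGAAAC"  # sequence number 11, 5'->3'
SEQ12 = "GCTCGGAGGTTTTTTCTTCTCTTTCCAGCTTTGGTAC"  # sequence number 12, as printed
NLS_PEPTIDE = "SEPPKKKRKVET"                    # sequence number 10, one-letter code


def translate(dna):
    """Translate complete codons only; a trailing partial codon is ignored."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def complement(dna):
    """Base-by-base complement, without reversing the strand."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))


peptide = translate(SEQ11[2:])  # assumed frame: skip the first two bases
print(peptide)                                 # SEPPKKKRKVE
print(NLS_PEPTIDE.startswith(peptide))         # True
print(complement(SEQ11[4:]) == SEQ12[:33])     # True
```

Under these assumptions the first eleven residues of the peptide are encoded in full and the final Thr codon is completed only after ligation, while the two strands anneal over 33 bases and leave four unpaired bases at each end, as would be expected for a linker designed for insertion between two restriction sites.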



Sequence number: 13 



Length of sequence: 419 

Form of sequence: Nucleic acid 

Number of strands: Two 

Topology: Straight chain 

Type of Sequence: Genomic DNA 

Sequence 



TCGACTGCTG TATATAAAAC CAGTGGTTAT ATGTACAGTA CTGCTGTATA TAAAACCAGT 60 

GGTTATATGT ACAGTACGTC GAGGGAATCA AATTAACAAC CATAGGATGA TAATGCGATT 120 

AGTTTTTTAG CCTTATTTCT GGGGTAATTA ATCAGCGAAG CGATGATTTT TGATCTATTA 180 

ACAGATATAT AAATGCAAAA ACTGCATAAC CACTTTAACT AATACTTTCA ACATTTTCGG 240 

TTTGTATTAC TTCTTATTCA AATGTAATAA AAGTATCAAC AAAAAATTGT TAATATACCT 300 

CTATACTTTA ACGTCAAGGA GAAAAAACTA TAATGACTAA ATCTCATTCA GAAGAAGTGA 360 

TTGTACCTGA GTTCAATTCT AGCGCAAAGG AATTACCAAG ACCATTGGCC GAAAAGTGC 419 



Sequence number: 14 

Length of sequence: 12 

Form of sequence: Nucleic acid 

Number of strands: One 

Topology: Straight chain 

Type of Sequence: Other nucleic acid synthetic DNA 
Sequence 

AATTGACCAC CC 12 



Sequence number: 15 
Length of sequence: 12 
Form of sequence: Nucleic acid 


Number of strands: One 
Topology: Straight chain 

Type of Sequence: Other nucleic acid synthetic DNA 
Sequence 

CTGGTGGGTT AA 12 



Sequence number: 16 
Length of sequence: 25 
Form of sequence: Nucleic acid 
Number of strands: One 
Topology: Straight chain 
Type of Sequence: Other nucleic acid synthetic DNA 
Sequence 

CTAGCTTGGG TGGAATTCAT ATGGC 25 



Sequence number: 17 
Length of sequence: 24 
Form of sequence: Nucleic acid 
Number of strands: One 

Topology: Straight chain 

Type of Sequence: Other nucleic acid synthetic DNA 
Sequence 

GAACCCACCT TAAGTATACG GTAC 24 



Sequence number: 18 
Length of sequence: 11 



Form of sequence: Nucleic acid 

Number of strands: One 

Topology: Straight chain 

Type of Sequence: Other nucleic acid synthetic DNA 
Sequence 

CTGCATGCAC C 11 



Sequence number: 19 

Length of sequence: 14 

Form of sequence: Nucleic acid 

Number of strands: One 

Topology: Straight chain 

Type of Sequence: Other nucleic acid synthetic DNA 
Sequence 

ATGGACGTAC GTGG 14 



Sequence number: 20 

Length of sequence: 32 

Form of sequence: Nucleic acid 

Number of strands: One 

Topology: Straight chain 

Type of Sequence: Other nucleic acid synthetic DNA 
Sequence 

CTATTCGATG ATGAAGATAC CCCACCAAAC CC 



Sequence number: 21 



Length of sequence: 30 
Form of sequence: Nucleic acid 
Number of strands: One 

Topology: Straight chain 

Type of Sequence: Other nucleic acid synthetic DNA 
Sequence 

GAAATTCGCC CGGAATTAGC TTGGCTGCAG 



Sequence number: 22 
Length of sequence: 32 
Form of sequence: Nucleic acid 
Number of strands: One 
Topology: Straight chain 

Type of Sequence: Other nucleic acid synthetic DNA 
Sequence 

CTATTCGATG ATGAAGATAC CCCACCAAAC CC 32 

Sequence number: 23 

Length of sequence: 30 

Form of sequence: Nucleic acid 

Number of strands: One 

Topology: Straight chain 

Type of Sequence: Other nucleic acid synthetic DNA 
Sequence 

GAAATTCGCC CGGAATTAGC TTGGCTGCAG 30 

Sequence number: 24 

Length of sequence: 32 

Form of sequence: Nucleic acid 

Number of strands : One 

Topology: Straight chain 

Type of Sequence: Other nucleic acid synthetic DNA 
Sequence 

TTTCAATTGG AATGGACGAA CTGTTCCCCC TC 32 

Sequence number: 25 

Length of sequence: 35 

Form of sequence: Nucleic acid 

Number of strands : One 

Topology: Straight chain 

Type of Sequence: Other nucleic acid synthetic DNA 
Sequence 

GCGCAGCGAG TCAGTGAGCG AGGAAGCGGA AGAGG 35 

Sequence number: 26 

Length of sequence: 35 

Form of sequence: Nucleic acid 

Number of strands: One 

Topology: Straight chain 

Type of Sequence: Other nucleic acid synthetic DNA 
Sequence 

TTTGAATTCT AATGATGTTC TCGGGTTTCA ACGCG 35 


Claims 

(1) A method of detecting the nuclear transportability of a 
peptide coded for by test DNA, characterized in that fused DNA of 
DNA coding for transcription factor not having nuclear 
transportability and test DNA is introduced into a eukaryotic 
host having in the nucleus thereof a promoter region that is 
activated by the binding of transcription factor and a reporter 
gene connected downstream from said promoter region, and in that 
expression of the reporter gene is detected. 

(2) The method of claim (1) wherein said transcription 
factor not having nuclear transportability is a fused protein 
comprising a nuclear export signal, a DNA binding domain, and a 
transcription activation domain. 

(3) The method of claim (1) wherein said transcription 
factor not having nuclear transportability is a fused protein 
comprising a nuclear export signal, LexA protein, and a GAL4 
transcription activation domain, and wherein the promoter region 
that is activated by binding of the transcription factor is the 
promoter region of a GAL1 gene in which the operator sequence has 
been replaced with the LexA operator sequence. 

(4) The method of claim (3) wherein said nuclear export 
signal is a peptide comprising the amino acid sequence described 
in sequence number 5. 

(5) The method of any of claims (1)-(4) wherein said 
reporter gene is the LEU2 and/or β-galactosidase gene. 

(6) A method of isolating DNA coding for a peptide having 
nuclear transportability characterized in that fused DNA of DNA 
coding for transcription factor not having nuclear 
transportability and test DNA is introduced into a eukaryotic 
host having in its nucleus a promoter region that is activated by 
binding of said transcription factor and a reporter gene 
connected downstream from said promoter region, in that 
expression of the reporter gene is detected, and in that test DNA 
is isolated from a eukaryotic host in which said expression has 
been detected. 

(7) The method of claim (6) wherein said transcription 
factor not having nuclear transportability is a fused protein 
comprising a nuclear export signal, a DNA binding domain, and a 
transcription activation domain. 


(8) The method of claim (6) wherein said transcription 
factor not having nuclear transportability is a fused protein 
comprising a nuclear export signal, LexA protein, and a GAL4 
transcription activation domain, and wherein said promoter region 
that is activated by binding of the transcription factor is the 
promoter region of a GAL1 gene in which the operator sequence has 
been replaced with the LexA operator sequence. 

(9) The method of claim (8) wherein said nuclear export 
signal is a peptide comprising the amino acid sequence described 
in sequence number 5. 

(10) The method of any of claims (6)-(9) wherein said 
reporter gene is the LEU2 and/or β-galactosidase gene. 

(11) A vector having an incorporation site for test DNA 
adjacent to DNA coding for transcription factor not having 
nuclear transportability. 

(12) The vector of claim (11) wherein said transcription 
factor not having nuclear transportability is a fused protein 
comprising a nuclear export signal, a DNA binding domain, and a 
transcription activation domain. 

(13) The vector of claim (11) wherein said transcription 
factor not having nuclear transportability is a fused protein 
comprising a nuclear export signal, LexA protein, and the GAL4 
transcription activation domain. 

(14) The vector of claim (13) wherein said nuclear export 
signal is a peptide comprising the amino acid sequence described 
in sequence number 5. 

(15) A kit comprising: 

(1) a vector having an incorporation site for test DNA 
adjacent to DNA coding for transcription factor not having 
nuclear transportability; and 

(2) a eukaryotic host having in its nucleus an expression 
unit comprising a promoter region activated by binding of 
said transcription factor and a reporter gene connected 
downstream from said promoter region. 

(16) The kit of claim (15) wherein said transcription factor 
not having nuclear transportability is a fused protein comprising 
a nuclear export signal, a DNA binding domain, and a 
transcription activation domain. 

(17) The kit of claim (15) wherein said transcription factor 
not having nuclear transportability is a fused protein comprising 
a nuclear export signal, LexA protein, and a GAL4 transcription 
activation domain; wherein said promoter region that is activated 
by binding of said transcription factor is the promoter region of 
a GAL1 gene in which the operator sequence has been replaced with 
the LexA operator sequence; and wherein said eukaryotic host is 
yeast. 

(18) The kit of claim (17) wherein said nuclear export 
signal is a peptide comprising the amino acid sequence described 
in sequence number 5. 



(19) The kit of any of claims (15)-(18) wherein said 
reporter gene is the LEU2 and/or β-galactosidase gene. 

1/8 

Keys to Fig. 1: 
(1) Ampicillin 

2/8 

Keys to Fig. 2: 
(1) Ampicillin 

3/8 

Keys to Fig. 3: 
(1) Ampicillin 

4/8 

Keys to Fig. 4: 
(1) Ampicillin 

5/8 

Keys to Fig. 7: 
(1) Ampicillin 